fix (v1): fix broken relative links related to --skip-next-release build option #1816

@parkas2018 parkas2018 commented Oct 8, 2019

The problem is described in issue #1805 and it seems to have existed for a long time.
Given versioning is enabled and the repository contains versioned docs, if the build
option --skip-next-release is used when building the static site, the versioned
docs contain a lot of broken links. The generator is not able to locate the targeted
file for the relative link and so it leaves the HTML links as ".md" instead of ".html".

The issue appears to be isolated to website/versioned_docs/ and occurs only if the
above-mentioned build option is used.

This change fixes this issue and allows generating a site with correct relative
links.
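
To illustrate the symptom, here is a minimal sketch. It is not the Docusaurus code: `rewriteLinks` and `linkMap` are hypothetical names, and the lookup table is an assumed stand-in for whatever mapping the generator builds. When a target is missing from the table, the link keeps its ".md" extension, which matches the behaviour described above.

```javascript
// Hypothetical sketch: rewrite relative markdown links using a lookup table.
// `linkMap` and `rewriteLinks` are illustrative names, not Docusaurus APIs.
function rewriteLinks(markdown, linkMap) {
  // Match inline links whose target ends in .md, e.g. [text](other.md)
  return markdown.replace(/\]\(([^)\s]+\.md)\)/g, (match, target) => {
    const html = linkMap[target];
    // If the target is unknown, the link is left untouched (still ".md").
    return html ? `](${html})` : match;
  });
}

const linkMap = {'doc1.md': '/docs/1.0.0/doc1.html'};
const source = 'See [one](doc1.md) and [two](doc2.md).';
console.log(rewriteLinks(source, linkMap));
// prints: See [one](/docs/1.0.0/doc1.html) and [two](doc2.md).
```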

closes #1805

Motivation

I have used Docusaurus in several of my work-related documentation sites. We recently published our documentation sites and found out about this bug. Currently our production sites have a lot of broken links in our "versioned docs".

I wanted to see a proper fix that will work for everyone instead of coming up with a workaround or hacking our sites post-build to address this issue.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

The changes in this PR are relatively small, so I applied them to my documentation projects where the bug occurs. Then I did the following:

  1. Generated a site using the command yarn build
  2. Served the generated site locally using a web server and verified that everything still works as usual and that relative links in both the next release and versioned docs work
  3. Generated a site using the command yarn build --skip-next-release
  4. Repeated step 2 above
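
The manual check in step 2 could also be partly automated. The sketch below is an assumption about how one might do that, not something this PR ships: `findMarkdownHrefs` is a hypothetical helper that lists the links in a built HTML page that still point at ".md" files.

```javascript
// Hypothetical checker: list hrefs in built HTML that still end in ".md".
// `findMarkdownHrefs` is an illustrative name, not part of Docusaurus.
function findMarkdownHrefs(html) {
  const hrefs = [];
  const pattern = /href="([^"]+)"/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // Ignore any #fragment or ?query when checking the extension.
    const target = match[1].split(/[?#]/)[0];
    if (target.endsWith('.md')) {
      hrefs.push(match[1]);
    }
  }
  return hrefs;
}

const page = '<a href="doc1.html">ok</a> <a href="doc2.md#setup">broken</a>';
console.log(findMarkdownHrefs(page));
// prints: [ 'doc2.md#setup' ]
```

Running such a function over every HTML file in the build output after each of the two build commands would flag any links that were left unconverted.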

I know that V2 is still under development but I wasn't able to find any code related to this feature.

@facebook-github-bot facebook-github-bot added the CLA Signed Signed Facebook CLA label Oct 8, 2019
@docusaurus-bot
Contributor

docusaurus-bot commented Oct 8, 2019

Deploy preview for docusaurus-2 ready!

Built with commit 8bcf10c

https://deploy-preview-1816--docusaurus-2.netlify.com

@docusaurus-bot
Contributor

docusaurus-bot commented Oct 8, 2019

Deploy preview for docusaurus-preview ready!

Built with commit 8bcf10c

https://deploy-preview-1816--docusaurus-preview.netlify.com

@parkas2018
Contributor Author

@endiliey - Would it be possible for you to take a quick look at this PR? Sorry for nudging you on this. Unfortunately my documentation sites are live and have lots of broken links due to this bug.

@endiliey endiliey left a comment

I looked at this PR previously, but my impression was that this might not be the full fix, so I haven't got back to it. It's only been 2 days too :(

In v1, when the very first version is cut, it copies all the docs into versioned_docs/version-1.

For subsequent versions, it creates a versioned doc if and only if the doc content is different. In cases where only one doc's content is different, the relative resolving won't work.
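
A minimal sketch of this "copy only what changed" rule follows. It is an illustration under assumptions, not the v1 implementation: `filesToCopy` is a hypothetical helper, and the two arguments are assumed to be plain maps from doc path to content.

```javascript
// Hypothetical sketch of "copy only what changed" when cutting a version.
// `current` maps doc paths to content in docs/; `latest` maps doc paths to the
// content the most recent version resolves to. Both are illustrative inputs.
function filesToCopy(current, latest) {
  return Object.keys(current)
    .filter(file => current[file] !== latest[file])
    .sort();
}

// First version: nothing exists yet, so every doc is copied.
console.log(filesToCopy({'foo/bar.md': 'A', 'hello.md': 'B'}, {}));
// prints: [ 'foo/bar.md', 'hello.md' ]

// Next version: only hello.md changed, so only it is copied.
console.log(
  filesToCopy(
    {'foo/bar.md': 'A', 'hello.md': 'B2'},
    {'foo/bar.md': 'A', 'hello.md': 'B'},
  ),
);
// prints: [ 'hello.md' ]
```

In the second case the new version folder contains only hello.md, so a relative link from hello.md to foo/bar.md cannot be resolved by looking inside that folder alone.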

@parkas2018
Contributor Author

Thanks @endiliey for looking into it and the PR. I'll test out my change on the scenario mentioned about subsequent versioning changes.

@parkas2018 parkas2018 force-pushed the 1805-fix-relative-link-in-versioned-docs branch from 343d177 to 191044d Compare October 18, 2019 00:43
@parkas2018 parkas2018 force-pushed the 1805-fix-relative-link-in-versioned-docs branch from 191044d to 41b4fb2 Compare October 21, 2019 15:40
@parkas2018 parkas2018 changed the title fix (v1): Fix broken relative links related to --skip-next-release build option fix (v1): fix broken relative links related to --skip-next-release build option Oct 28, 2019
@parkas2018
Contributor Author

@endiliey - Would it be possible for you to review this PR again?

@endiliey endiliey left a comment

To be honest, I’m not confident this is the right fix, so I’m not inclined to merge it, because I’m also not sure what other bugs could potentially surface.

I can point out a few potential problems:

  1. It tries to replace md with html. What if the user uses cleanUrl?
  2. What if the “next” docs have a different structure or some files are deleted?

I think all of this actually comes down to the “relative linking fails” bug.

I think the right fix is to create an imaginary folder structure (with all the fallback docs imagined as being present in each version). Then resolving relative links becomes very easy.

@parkas2018
Contributor Author

Thanks for reviewing this @endiliey . Regarding your concerns:

It tries to replace md with html. What if user use cleanurl

I don't think I've changed the logic that takes care of the cleanUrl configuration. If you take a look at the preview sites deployed from my latest commit, you'll see that it's still working on the actual Docusaurus site, where this configuration is used. The relative links are working fine with the cleanUrl setup.

What if the “next” docs have a different structure or some file are deleted

This is a valid concern, but this is actually a fundamental issue with versioning feature in V1. I believe this is one of the reasons why V2 development started (https://docusaurus.io/blog/2018/09/11/Towards-Docusaurus-2#versioning). Plus I think your concern with this is valid regardless of whether this fix is accepted or not. And that's out of the scope of this bug fix, in my opinion.

Thoughts?

@parkas2018
Contributor Author

It tries to replace md with html. What if user use cleanurl

I don't think I've changed the logic that takes care of the cleanUrl configuration. If you take a look at the preview sites deployed from my latest commit, you'll see that it's still working on the actual Docusaurus site, where this configuration is used. The relative links are working fine with the cleanUrl setup.

I just realized that the preview site that is deployed does not use the build option --skip-next-release. So, yes there is probably an issue if users use cleanUrl configuration. I think that is an easy fix though. If you agree with the current approach, I can push another commit to fix that.

@parkas2018 parkas2018 force-pushed the 1805-fix-relative-link-in-versioned-docs branch from 41b4fb2 to d1e6590 Compare October 31, 2019 22:00
@parkas2018
Contributor Author

I pushed a commit yesterday after I noticed a bug in this fix. I believe there might still be a bug in here:

} else if (fs.existsSync(targetFileAtRoot)) {
  htmlLink = resolve(docsSource, mdMatch[1]).replace('.md', '.html');
}

I can test this out and push a fix if the current approach is acceptable.
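
One general hazard with this kind of replacement, which may or may not be the bug meant here, is that `String.prototype.replace` with a string pattern changes only the first occurrence, wherever it appears. The sketch below shows the effect on a made-up path and a hypothetical helper (`mdExtensionToHtml`) that anchors the match to the end instead.

```javascript
// String.prototype.replace with a string pattern replaces only the first
// occurrence, wherever it appears in the path.
const link = 'guides.md-notes/intro.md';
console.log(link.replace('.md', '.html'));
// prints: guides.html-notes/intro.md

// Hypothetical alternative: anchor the match to the end of the path.
function mdExtensionToHtml(target) {
  return target.replace(/\.md$/, '.html');
}
console.log(mdExtensionToHtml(link));
// prints: guides.md-notes/intro.html
```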

…build option

The problem is described in issue facebook#1805 and it seems to have existed for a long time.
Given versioning is enabled and the repository contains versioned docs, if the build
option `--skip-next-release` is used when building the static site, the versioned
docs contain a lot of broken links. The generator is not able to locate the targeted
file for the relative link and so it leaves the HTML links as ".md" instead of ".html".

The issue appears to be isolated to `website/versioned_docs/` and occurs only if the
above-mentioned build option is used.

This change fixes this issue and allows generating a site with correct relative
links.
… different

The previous commit would fix the broken relative links involving `--skip-next-release` when
the very first/initial version is created or if **all** docs were updated before creating a
version. If only one or a few docs were updated before creating a version, there would still be
broken relative links when building the site with `--skip-next-release`.

This commit fixes the issues mentioned above.
The previous commit didn't account for an "invalid relative link" where the target file
does not exist. This commit fixes that and also makes sure that relative links are
converted even when the `--skip-next-release` build option is used.
The previous commit had a bug that didn't resolve the relative link properly. Addressed
that bug and also made sure that it will work if anyone uses the `cleanUrl` configuration.
@parkas2018 parkas2018 force-pushed the 1805-fix-relative-link-in-versioned-docs branch from d1e6590 to 8bcf10c Compare November 6, 2019 16:10
@parkas2018
Contributor Author

I pushed a commit yesterday after I noticed a bug in this fix. I believe there might still be a bug in here:

} else if (fs.existsSync(targetFileAtRoot)) {
  htmlLink = resolve(docsSource, mdMatch[1]).replace('.md', '.html');
}

I can test this out and push a fix if the current approach is acceptable.

I believe I've resolved it in the latest commit. Please review once more. Thank you.

@endiliey endiliey left a comment

I don't know how I missed the notification on this PR, but I guess the review request prompted an email.

I still think the right fix is to create an imaginary folder structure (with all the fallback docs imagined as being present in each version). Hence I'm not confident about, or inclined toward, merging this. Every merge means taking responsibility for it if anything breaks. You might want to request a review from other maintainers if needed.

@parkas2018
Contributor Author

Thanks for looking into this again.

I still think the right fix is to create an imaginary folder structure (with all the fallback docs imaginatively being in each version).

What would be the benefit with that approach?

Is there a certain aspect of the current approach that is problematic?

You might want to request for other maintainers review if needed

I'm happy to work on this with your guidance. If you'd like to invite others for discussions, that's okay too 😄

@endiliey
Contributor

endiliey commented Nov 7, 2019

The benefit is that it would close out all the issues of broken relative markdown linking, even if the directory structure in the “next” docs folder changes. In the end, the only caveat in v1 fallback versioning would be that you can't have docA.md in version 1.0.0 but no docA.md in version 1.1.0 (there is no way to delete an old document, since it will fall back).

Anyway I’m working on versioning at v2 right now 😅

@parkas2018
Contributor Author

The benefit is that it would close out all the issues of broken relative markdown linking

To be honest, I'm a little hesitant on tackling all issues in a single PR. I think that has the potential to be more error prone and difficult to trace issues. Anyways, what other relative linking issues are there? It'd help if I know the nature of those issues at first. Would you be able to point me to those issues?

close out all the issues of broken relative markdown linking, even if directory structure in the “next” docs folder change. Finally, the only caveat in v1 fallback versioning will only be the fact that you cant have docA.md in version 1.0.0 but no docA.md in version 1.1.0

Not sure if I have a misinterpretation of how versioning works in V1. My understanding is that version 1.1.0 gets created from the next release docs (in the docs/ directory), not from an existing version's docs (in the website/versioned_docs/version-1.0.0 directory). So pretty much all versioned docs depend on changes in the "next" docs folder.

That's why in this PR, right now it's validating the actual file path. Even if we create an imaginary folder structure to validate the relative links, would we not need to validate whether the target file exists or not?

Anyway I’m working on versioning at v2 right now 😅

Can't wait to try it out 😃 . Is there anyone else who can look into this PR? I'm happy to continue with this PR but I think I will need some help and I understand you're busy on v2.

@endiliey
Contributor

endiliey commented Nov 7, 2019

Not sure if I have a misinterpretation of how versioning works in V1. My understanding is that version 1.1.0 gets created from the next release docs (in the docs/ directory), not from an existing version's docs (in the website/versioned_docs/version-1.0.0 directory). So pretty much all versioned docs depend on changes in the "next" docs folder.

Nah, that is slightly wrong. Edit: not sure if I misunderstood your statement, but anyway,

Read https://docusaurus.io/docs/en/versioning#fallback-functionality

Only files in the docs directory and sidebar files that differ from those of the latest version will get copied each time a new version is specified. If there is no change across versions, Docusaurus will use the file from the latest version with that file.

For example, a document with the original id doc1 exists for the latest version, 1.0.0, and has the same content as the document with the id doc1 in the docs directory. When a new version 2.0.0 is created, the file for doc1 will not be copied into versioned_docs/version-2.0.0/. There will still be a page for docs/2.0.0/doc1.html, but it will use the file from version 1.0.0.

Imagine you have a fresh new Docusaurus site with no versioning yet and only one docA.md in the docs folder. When you create new version 1.0.0, docA.md will be copied to versioned_docs/version-1.0.0/docA.md. If you then create new version 1.1.0 without changing it, it won't copy docA.md to the versioned folder.

When you access the 1.1.0 route, it's using the 1.0.0 docs. It's called fallback. Not rise back :)
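
The lookup described here can be sketched roughly as follows. This is only an illustration of the fallback idea: `resolveFallback`, `versions`, and `filesByVersion` are hypothetical names, and the real v1 code works from generated metadata rather than from lists like these.

```javascript
// Hypothetical sketch of version fallback lookup. `versions` is ordered from
// oldest to newest; `filesByVersion` lists the files physically present in
// each versioned_docs/version-X folder. Names are illustrative only.
function resolveFallback(file, version, versions, filesByVersion) {
  // Walk from the requested version back toward the oldest one.
  for (let i = versions.indexOf(version); i >= 0; i--) {
    const candidate = versions[i];
    if ((filesByVersion[candidate] || []).includes(file)) {
      return `versioned_docs/version-${candidate}/${file}`;
    }
  }
  return null; // the doc does not exist in this version or any earlier one
}

const versions = ['1.0.0', '1.1.0'];
const filesByVersion = {'1.0.0': ['docA.md'], '1.1.0': []};
console.log(resolveFallback('docA.md', '1.1.0', versions, filesByVersion));
// prints: versioned_docs/version-1.0.0/docA.md
```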

@parkas2018
Contributor Author

Imagine you have a fresh new docusaurus site, no versioning yet. Only one docA.md in docs folder. When you create new version 1.0.0, the docA.md will be copied to versioned_docs/version-1.0.0/docA.md. If you try to make new version 1.1.0 again, it won't copy docA.md to the versioned folder.

When you access 1.1.0 route, it's using the 1.0.0 docs. It's called fallback. Not rise back :)

So, in this case, before I make the new version 1.1.0, which file should be modified for updated contents? It should be the one in docs/ directory and not website/versioned_docs/version-1.0.0 directory. This way, when version 1.1.0 is created, it will have docA.md copied from docs/ directory.

Is that right?

@endiliey
Contributor

endiliey commented Nov 8, 2019

yes

@parkas2018
Contributor Author

Thanks. Would you be able to give me some pointers on where to start if we need to use an imaginary file/folder structure?

@yangshun
Contributor

cc @endiliey

@parkas2018
Contributor Author

@endiliey, @yangshun - Is there any chance you'd be able to help me out with this PR?

I'm not really sure where to start based on the last recommendation. To be honest, I'm also having difficulty seeing issues from the current changes in this PR. I believe I've addressed all the regression bugs since the first commit.

I fully support the suggestion of "right fix". But as I mentioned previously, I don't think this PR should be resolving multiple issues especially those that appear due to the architecture of versioning feature of V1. Another problem is that I couldn't find other relevant open issues that might be related to this PR. That's why I'm not sure in what other cases/scenarios my current change will not work.

Anyways, it'd be great if you could test out my current change. If I need to change the implementation logic, I need a little bit more help with pointing me to the right direction 😄

@endiliey
Contributor

First of all, thank you for taking the time to send a PR. I'd like to point out that reviewing takes time. I receive tons of notifications every day and it's more than possible that I completely missed that.

I'm not really sure where to start based on the last recommendation

So ultimately you want to make sure that this

// mdToHtml is a map from a markdown file name to its html link, used to
// change relative markdown links that work on GitHub into actual site links
function mdToHtml(Metadata, siteConfig) {
  const {baseUrl, docsUrl} = siteConfig;
  const result = {};
  Object.keys(Metadata).forEach(id => {
    const metadata = Metadata[id];
    if (metadata.language !== 'en' || metadata.original_id) {
      return;
    }
    let htmlLink = baseUrl + metadata.permalink.replace('/next/', '/');
    const baseDocsPart = `${baseUrl}${docsUrl ? `${docsUrl}/` : ''}`;
    const i18nDocsRegex = new RegExp(`^${baseDocsPart}en/`);
    const docsRegex = new RegExp(`^${baseDocsPart}`);
    if (i18nDocsRegex.test(htmlLink)) {
      htmlLink = htmlLink.replace(i18nDocsRegex, `${baseDocsPart}en/VERSION/`);
    } else {
      htmlLink = htmlLink.replace(docsRegex, `${baseDocsPart}VERSION/`);
    }
    result[metadata.source] = htmlLink;
  });
  return result;
}
will work for a simple site, a versioned site, a translated site, and a translated + versioned site. (Yes, Docusaurus is that complicated.)
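
To make the behaviour of this function concrete, the sketch below reproduces it and runs it on made-up input. The `siteConfig` values and the two metadata entries are assumptions chosen for illustration, not data from a real site. Note how the entry carrying `original_id` (a versioned doc) is skipped, so only the "next" doc ends up in the map.

```javascript
// mdToHtml as quoted above, reproduced so the example is self-contained.
function mdToHtml(Metadata, siteConfig) {
  const {baseUrl, docsUrl} = siteConfig;
  const result = {};
  Object.keys(Metadata).forEach(id => {
    const metadata = Metadata[id];
    if (metadata.language !== 'en' || metadata.original_id) {
      return;
    }
    let htmlLink = baseUrl + metadata.permalink.replace('/next/', '/');
    const baseDocsPart = `${baseUrl}${docsUrl ? `${docsUrl}/` : ''}`;
    const i18nDocsRegex = new RegExp(`^${baseDocsPart}en/`);
    const docsRegex = new RegExp(`^${baseDocsPart}`);
    if (i18nDocsRegex.test(htmlLink)) {
      htmlLink = htmlLink.replace(i18nDocsRegex, `${baseDocsPart}en/VERSION/`);
    } else {
      htmlLink = htmlLink.replace(docsRegex, `${baseDocsPart}VERSION/`);
    }
    result[metadata.source] = htmlLink;
  });
  return result;
}

// Sample inputs are assumptions for illustration, not real site data.
const siteConfig = {baseUrl: '/', docsUrl: 'docs'};
const Metadata = {
  doc1: {language: 'en', permalink: 'docs/next/doc1.html', source: 'doc1.md'},
  'version-1.0.0-doc1': {
    language: 'en',
    original_id: 'doc1',
    permalink: 'docs/1.0.0/doc1.html',
    source: 'version-1.0.0/doc1.md',
  },
};
console.log(mdToHtml(Metadata, siteConfig));
// prints: { 'doc1.md': '/docs/VERSION/doc1.html' }
```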

To be honest, I'm also having difficulty seeing issues from the current changes in this PR.

I could point out another issue from this PR, even from a glance.

  • It won't work on Windows. See how you resolve the path when finding the original file? It assumes a UNIX path, which is wrong.
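
The PR code in question is not shown in this thread, so the following only illustrates the general concern using Node's `path` module. `toPosixPath` is a hypothetical helper, and the paths are made up.

```javascript
const path = require('path');

// The same relative path joined with each platform's rules.
console.log(path.posix.join('versioned_docs', 'version-1.0.0', 'doc1.md'));
// prints: versioned_docs/version-1.0.0/doc1.md
console.log(path.win32.join('versioned_docs', 'version-1.0.0', 'doc1.md'));
// prints: versioned_docs\version-1.0.0\doc1.md

// Hypothetical helper: normalize a file system path to forward slashes so
// it can be compared with keys that always use "/".
function toPosixPath(filePath) {
  return filePath.split(path.win32.sep).join(path.posix.sep);
}
console.log(toPosixPath('versioned_docs\\version-1.0.0\\doc1.md'));
// prints: versioned_docs/version-1.0.0/doc1.md
```

Code that builds lookup keys by concatenating with "/" will not match paths produced by the platform-specific functions on Windows unless one side is normalized.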

The fact that you changed the line below is a potential source of bugs, because the lines of code below it no longer share the same assumption.

- if (metadata.language !== 'en' || metadata.original_id) {
+ if (metadata.language !== 'en') {

Take the next line for example.

let htmlLink = baseUrl + metadata.permalink.replace('/next/', '/');

If one of your docs is named next/super.md, your permalink for the v1.0.0 versioned docs could be /docs/1.0.0/next/super.html.
That's going to be rewritten incorrectly:
/docs/1.0.0/next/super.html -> /docs/1.0.0/super.html (which is wrong)

You need to change the whole logic below it to ensure it's not buggy.
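
The effect can be reproduced in isolation. In the sketch below, `stripNextSegment` and the `docsPrefix` value are hypothetical; the narrower variant is one possible direction rather than a complete fix, since it ignores the translated layout (for example a language segment before "next/").

```javascript
// The replacement quoted above removes the first "/next/" anywhere in the path.
const permalink = '/docs/1.0.0/next/super.html';
console.log(permalink.replace('/next/', '/'));
// prints: /docs/1.0.0/super.html

// Hypothetical narrower variant: only strip "next/" when it directly follows
// the docs prefix. `docsPrefix` is an assumed value for illustration.
function stripNextSegment(link, docsPrefix) {
  const nextPrefix = `${docsPrefix}next/`;
  return link.startsWith(nextPrefix)
    ? docsPrefix + link.slice(nextPrefix.length)
    : link;
}
console.log(stripNextSegment('/docs/next/doc1.html', '/docs/'));
// prints: /docs/doc1.html
console.log(stripNextSegment(permalink, '/docs/'));
// prints: /docs/1.0.0/next/super.html
```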

I don't know what other bugs it could introduce. That's why I don't favor a PR that solves only a small issue but introduces another one.

Take #1869 for example: it looked harmless, but it's causing another bug now.

Anyways, it'd be great if you could test out my current change. If I need to change the implementation logic, I need a little bit more help with pointing me to the right direction

Testing out a change takes time. I'd prefer that you write tests for it; then I'll be confident to merge it. If I have to point out every issue that might arise, it's just going to take my time, which is better spent on v2.

@parkas2018
Contributor Author

Thanks @endiliey for your feedback and time. I really appreciate it. I will go through your suggestions and see if I can come up with a different approach.

Question on the following change when you get a chance:

- if (metadata.language !== 'en' || metadata.original_id) {
+ if (metadata.language !== 'en') {

Why is the above change incorrect or should remain as-is? Currently the mdToHtml function won't do anything for the versioned docs. It'll always be processing the "next release" docs.

https://github.com/facebook/docusaurus/blob/master/packages/docusaurus-1.x/lib/server/metadataUtils.js#L66-L76

If one of your docs is named next/super.md. Your permalink for v1.0.0 versioned docs could be /docs/1.0.0/next/super.html.
That's gonna be a misnomer,
/docs/1.0.0/next/super.html -> /docs/1.0.0/super.html (which is wrong)

Yes, I agree with you on this one. But, I think this is not a new issue. I believe the way versioning works in V1 at the moment, you can't have documents in a directory named next. I haven't had the time to test it out yet, but that's what I can see happening here:

https://github.com/facebook/docusaurus/blob/master/packages/docusaurus-1.x/lib/server/metadataUtils.js#L76

It's also kind of impractical in my opinion because that's like saying, what if a user decided to name one of the directories using a language code? For example, having docs under docs/en/super.md. There might be a valid use case for this even though I can't think of any right now. But this will also fail under the current versioning, as far as I can tell.

@endiliey
Contributor

endiliey commented Nov 22, 2019

Yeah, I'm not saying it should remain as it is. I'm just pointing out that if you want to fix that, you might as well fix that part too. A right fix is better than a partial fix.

Edit: also, my point is that if you edit the line below, you need to edit all of the code below it as well.

- if (metadata.language !== 'en' || metadata.original_id) {
+ if (metadata.language !== 'en') {

  htmlLink = htmlLink.replace(i18nDocsRegex, `${baseDocsPart}en/VERSION/`);
} else {
  htmlLink = htmlLink.replace(docsRegex, `${baseDocsPart}VERSION/`);

@endiliey
Contributor

What I'm saying is that the right fix is to revamp mdToHtml so that it creates the mapping correctly without depending on the next docs.

@parkas2018
Contributor Author

I understand. To me it looks like the right fix would require changing how versioning works in V1. But that's what V2 is for 😄

@parkas2018
Contributor Author

@endiliey - I didn't get a chance to look at this in the last few days. I wanted to get your feedback before diving into any code change.

Right now, Docusaurus generates a "metadata" file here: https://github.com/facebook/docusaurus/blob/master/packages/docusaurus-1.x/lib/server/generate.js#L57-L60

This generated metadata is then fed into mdToHtml, which is used when generating the HTML files to convert the .md links in their contents to .html.

https://github.com/facebook/docusaurus/blob/master/packages/docusaurus-1.x/lib/server/generate.js#L57-L60

I still think the right fix is to create an imaginary folder structure (with all the fallback docs imaginatively being in each version).

You suggested generating a virtual directory structure and then resolving relative links against it. How would you go about generating this directory structure? We can't use the generated "metadata" file because it will not contain the "next release" docs. In that case, we won't be able to resolve relative links that depend on files in the "next release", because those files were not copied over when the version was created.

Would you be able to clarify how you would create the imaginary folder structure with all the fallback docs being in each version?

@endiliey
Contributor

endiliey commented Dec 4, 2019

We can't use the generated "metadata" file because it will not contain the "next release" docs.

First of all, let's make sure we're on the same page.

Imagine that this is your first new site. We have two docs, foo/bar.md and hello.md
image

When we cut our first version 1.0.0, this is what happens. It creates the very first fallback docs.
image

Note that Docusaurus v1 creates versioned docs if and only if the doc content is different.

So when we cut a new version 1.1.0 and the only doc changed from v1.0.0 to v1.1.0 is hello.md,
the website structure looks like this:
image

So the idea of generating an imaginary folder structure is to make it as if the missing file version-1.1.0/foo/bar.md exists (but only in the metadata).
image

Does it matter if the next release docs are deleted?

Don't you think we can resolve the relative links in an old version even without the next release docs if we have an imaginary folder structure like this?
image
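
Since the screenshots are not reproduced here, the idea can be sketched in code. This is a rough illustration under assumptions, not a proposed patch: `buildVirtualTree` and `resolveLink` are hypothetical names, and the input is a simple list of files physically present per version, using the foo/bar.md and hello.md example from this comment.

```javascript
const path = require('path');

// Hypothetical sketch of the "imaginary folder structure". For each version,
// every doc known so far is listed, pointing at the physical file that the
// fallback would use. Inputs and names are illustrative only.
function buildVirtualTree(versions, filesByVersion) {
  const tree = {};
  const latestSource = {}; // doc path -> physical file seen most recently
  versions.forEach(version => {
    (filesByVersion[version] || []).forEach(file => {
      latestSource[file] = `version-${version}/${file}`;
    });
    tree[version] = {...latestSource};
  });
  return tree;
}

// Resolve a relative link against the virtual tree of one version.
function resolveLink(tree, version, fromFile, relativeLink) {
  const target = path.posix.join(path.posix.dirname(fromFile), relativeLink);
  return target in tree[version] ? target : null;
}

const tree = buildVirtualTree(['1.0.0', '1.1.0'], {
  '1.0.0': ['foo/bar.md', 'hello.md'],
  '1.1.0': ['hello.md'],
});
console.log(tree['1.1.0']['foo/bar.md']);
// prints: version-1.0.0/foo/bar.md
console.log(resolveLink(tree, '1.1.0', 'hello.md', 'foo/bar.md'));
// prints: foo/bar.md
```

Nothing in this sketch reads the docs/ folder, so the link from hello.md to foo/bar.md in version 1.1.0 resolves whether or not the next release docs are present. It also shows the caveat mentioned earlier in the thread: a doc can never disappear from a later version.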

I wanted to highlight that fallback shouldn't depend on the next release docs. The current v1 implementation is wrong, but that's mainly because the first implementation never took into account the --skip-next-release option, which was added recently. (Hence why we regretted accepting that feature when we were already in maintenance mode; it has caused a lot of problems already.)
But well, we make mistakes 😉 and software isn't perfect.

Another reason why I'm reluctant to quickly accept fixes is that there's no typing in v1 and not many tests are in place. Even if I were to accept the PR, I'd need someone else to approve too.

To be honest, I know how to fix this and most of the problems v1 has, but would it be worth it? It's definitely going to take a lot of time too. We're getting closer to the v2 beta anyway, which is a rewrite written in TypeScript and has 100% docs test coverage.

@parkas2018
Contributor Author

@endiliey - Thanks for the detailed example and clarifying that fallback shouldn't depend on next release docs.

Should I continue to look for a fix for this? If it's decided that certain issues in V1 will not be fixed, then it's probably going to save some time for both of us 😄

@endiliey
Contributor

endiliey commented Dec 5, 2019

If you can send a fix like the one above, I will gladly accept it 😆. But I do need someone else’s approval too.

But I think you are right, we can both save our time :). Why not try v2 or contribute to it 😀

@endiliey
Contributor

Let's close this for now, assuming you're not working on it. Thanks for the attempt btw 😉

@endiliey endiliey closed this Dec 15, 2019
@parkas2018
Contributor Author

Unfortunately I got a little busy with other work and didn't get a chance to come back to this again. I can try this again, if you'd like.

I looked into the version fallback briefly and realized that it has info on all of the versioned docs and their corresponding sources. I think that can be used to validate whether the target link exists in that fallback source. What do you think?

@parkas2018
Contributor Author

@endiliey - I decided to take another attempt at it. Since this PR is already closed, please take a look at the changes in here:

https://github.com/facebook/docusaurus/compare/master...parvezakkas:1805-fix-relative-link-in-versioned-docs?diff=split

I think this is a simpler fix compared to my previous attempts and it doesn't rely on accessing files to validate links. I believe it also fixes #1774

If you're interested in reviewing this change further, I'll open another PR. Thank you for your time.

@parkas2018
Contributor Author

@endiliey - Just checking if you can review the above change. I think the current change aligns with the Version Fallback mechanism now.

BTW - Happy new year. Hope you've had a good holiday.

@parkas2018
Contributor Author

@yangshun - I learned the sad news about Endi leaving us through the recent release of V1. He had been guiding me on this PR for "right fix". Initially I had some misunderstanding about how the version fallback worked.

I tried to find a proper fix but wasn't quick enough to be able to get Endi's feedback. Is this something you can look into? Should I open a separate PR or are you able to re-open this?

@yangshun
Contributor

yangshun commented Jan 18, 2020 via email

@parkas2018
Contributor Author

Thanks for your reply @yangshun

This isn't actually anything new for v1; it's a bug fix. It would probably make sense to freeze changes in v1, if v2 was production ready but that's not the case as of this moment.

Anyways, ultimately it's your decision. Thanks again for your time.

Labels
CLA Signed Signed Facebook CLA
Development

Successfully merging this pull request may close these issues.

Relative docs links are broken in "versioned" docs when building with "skip-next-release"
5 participants