Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix margin values not working for html #2977

Merged
merged 23 commits into from Sep 14, 2021
Merged

Conversation

jeffsieu
Copy link
Contributor

Source

image

<div id="a">
  <h2>Here is some long text</h2>
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec vitae luctus mauris. Pellentesque aliquam turpis a ligula luctus, ac aliquet nisl viverra. Nulla augue mauris, lobortis id aliquam vitae, dignissim rhoncus augue. Duis aliquam pharetra sodales. Ut faucibus molestie odio, vel vestibulum sapien eleifend in. Nullam consequat ipsum ac magna iaculis, eget rutrum orci tempor. Proin at enim facilisis, ullamcorper urna eu, cursus arcu. Sed ac tincidunt libero. Mauris maximus libero sapien, vel dignissim orci vulputate in.</p>
  <p>More content</p>
  <p>Some more very very very long text</p>
</div>

jsPDF code

var doc = new jsPDF();
doc.html(document.getElementById('a'), {
  callback: function () {
    window.open(doc.output('bloburl')); // to debug
  },
  y: 60,
  margin: 20,
});

Output

image

Resolves #2924

@HackbrettXXX
Copy link
Collaborator

I thought about the margin/width behavior once again... the two options are basically these:
html-margins

The blue/red rectangle is the size and position of the canvas inside the PDF document. What you implemented is a), what I originally had in mind is b). But I'm starting to like option a) better, because it's behavior is easier to understand and makes the margins independent of the x/y and width options. What do you think?

@jeffsieu
Copy link
Contributor Author

@HackbrettXXX Yeah, I think a) is better as it is more predictable and allows for more fine-tuned behavior. I feel that the functionality of b) could be possibly set in a flag or something, like options.scaleToFitMargins = true

@HackbrettXXX
Copy link
Collaborator

Ok, so the behavior for the left side of the canvas is the same? If I specify, say, x=0 and margin-left=10 the canvas size is not adjusted, but instead 10px on the left are just clipped off? Same thing for the top margin/y on the first page?

src/jspdf.js Outdated Show resolved Hide resolved
@jeffsieu
Copy link
Contributor Author

@HackbrettXXX Currently the behavior is as you specified in the issue. the content will be shifted to the left by max(marginLeft, x). The first page will be offset by max(marginTop, y), subsequent pages will start from marginTop away from the top of the page.

@HackbrettXXX
Copy link
Collaborator

Ok, then I suggest we change that behavior such that the margin option has no impact on the position of the canvas but clips any overlapping content on the left and on the top on the first page.

@jeffsieu
Copy link
Contributor Author

jeffsieu commented Oct 26, 2020

@HackbrettXXX To be honest, I'm not sure how to "clip" content, will it be acceptable to simply omit the letters of text if they lay outside the margin? Instead of "clipping" the letter

@jeffsieu
Copy link
Contributor Author

@HackbrettXXX Then what about margins for the second page, if we define margins as so, won't the margins cut off any content in the top part of the second page?

@jeffsieu
Copy link
Contributor Author

jeffsieu commented Oct 27, 2020

@HackbrettXXX After some thinking on my own, I feel that margins should inset the content on their own.

Your current proposal could be useful, but I feel that this should not be the default behavior.
When I think of margins, I would use them to inset content, instead of clipping content. I think the current behavior of clipping the content on the right still makes sense though.

So, I would counter-propose the following:

  • marginLeft pushes content left; x will push content further left by that amount (i.e. the content will start from marginLeft + x)
  • marginRight clips content that exceed pageWidth - marginRight
  • marginTop causes content to be rendered starting from marginTop pixels below the top of the page, and is applicable for every page; y offsets content of the entire PDF downwards by that amount (content starts from marginTop + y for the first page, and starts from marginTop for subsequent pages)
  • marginBottom causes content to be rendered only until pageHeight - marginBottom, after which the next line is output to the top of the following page (at y position = marginTop)

This means making the following changes:

  • Shift content to the left by marginLeft + x instead of max(marginLeft, x)
  • Start content on the first page from marginTop + y instead of max(marginTop, y)

What do you think?

@HackbrettXXX
Copy link
Collaborator

After some thinking I agree with you. I think the margins and x/y properties should behave similar to HTML and CSS.

  • The equivalent of the margins would be padding in CSS terms. Why padding and not margin? Basically what we want to achieve here are insets inside of the PDF page and not some margins around our virtual canvas.
  • The equivalent of the x/y properties would then be margin-left and margin-top.

For the top/left corner HTML/CSS behaves exactly like you propose:

image

<div style="position: absolute; top: 20px; left: 300px; width: 200px; height: 200px; border: 1px solid black; padding: 10px;">
  <div style="position: absolute; top: 10px; left: 10px; width: 200px; height: 200px; border: 1px dashed gray; "></div>
  <div style="width: 200px; height: 200px; background: rgba(0,50,150,0.5); margin-left: 10px; margin-top: 10px;"></div>
</div>

The dashed rect is for illustrating the padding, the blue rect would be our virtual canvas.

On the bottom/right corner, in HTML/CSS the rect extends beyond the padding. This is different than your proposal.

However, I still think we should implement it like you propose: cut any overlapping content at pageWidth - rightMargin and move any overlapping content at the bottom to the next page starting at topMargin. Otherwise, the bottom and right margins would have no effect at all.

What's left to discuss would be if we call the property "margins" or "padding". I think both names would be fine. "Padding" would be closer to the HTML world, whereas "margin" intuitively fits better into jsPDF's world. What do you think?

@jeffsieu
Copy link
Contributor Author

jeffsieu commented Oct 27, 2020

@HackbrettXXX I think margins would be better, as margins are what we usually call it when we talk about pages. Furthermore, keeping the name of the current property will be less confusing for developers using this library as well.

@HackbrettXXX
Copy link
Collaborator

Alright. Could you implement the changes? :) Should be basically replacing all the Math.maxs by +

Copy link
Collaborator

@HackbrettXXX HackbrettXXX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a unit test to the test/specs/html.spec.js file with a reference PDF? See the contribution guide on how to do this.

…he right place

- fix getPagesByPath (needs to respect prevPageLastElemOffset, too)
- add clip path at margins for images and paths
- improve doc/typings
@HackbrettXXX
Copy link
Collaborator

I've fixed a lot of bugs and it seems like it's working quite well now. Haven't tried it with a real-world example though.

I'm planning on adding a switch between two different autopaging modes:

  • one that's optimized for text (the one that's implemented so far in this PR).
  • one, where text will just be cut in half if it runs across two pages. I think this second mode is still necessary, because the first mode only works for simple, but common cases. The first mode does not work well when we have e.g. text in multiple columns where the lines don't start at the same y position. This mode just cuts the canvas in tiles and will give better results for anything else than just plain text.

@Rickardo228
Copy link

Awesome, great work. I'm happy to try this out with a real-world example next week. Do ping me if I don't get back to you :)

# Conflicts:
#	src/modules/context2d.js
#	src/modules/html.js
#	test/specs/context2d.spec.js
#	test/specs/html.spec.js
@HackbrettXXX HackbrettXXX linked an issue Jul 19, 2021 that may be closed by this pull request
@glebov21
Copy link

Who not want to wait for merge: clone and build https://github.com/jeffsieu/jsPDF

@Toddman0430
Copy link

Guys-any idea when this will be merged in? Also, wanted to confirm that the proper setting to ensure text that continues to a subsequent page w/o cutoff is autoPaging = true on the jsPDF.context2d instance. Thanks!

@JamesDAdams
Copy link

@glebov21 your fix doesn't work

image

@HackbrettXXX
Copy link
Collaborator

@JamesDAdams did you

  • build the sources
  • pass autoPaging: 'text' to the html function?

@Rickardo228 did you get the chance to test it with real world examples?

 - update reference files from width/windowWidth PR (parallax#3203): there are no longer duplicate words
 - check in previously forgotten file
@Yabonev
Copy link

Yabonev commented Sep 9, 2021

@HackbrettXXX I couldn't get it to work from my side, can you tell me if I did anything wrong?

First, here's the code calling the .html method. I've skipped most variables as they're not the focus. I used lorem ipsum with big font size.

 const doc = await document.html(
      `<div class="p-margin-zero" style="width:${marginedWidthInPixels}px;">${remarks}</div>`,
      {
        x,
        y,
        jsPDF: document,
        autoPaging: 'text',
        html2canvas: {
          svgRendering: true,
          allowTaint: true,
          width: marginedWidthInPixels,
          imageTimeout: 0,
          backgroundColor: 'none',
          scrollY: (s.page.height + s.margins.top) * pixelsPerInch,
          scale: 0.01,
        },
      }
    );

What I did was:

  1. Clone https://github.com/jeffsieu/jsPDF locally
  2. npm install in cloned repo
  3. Reference it in my project's package.json like this "jspdf": "file:../../jsPDF",(path to folder with jsPDF's package.json)
  4. npm install jspdf(which updated my *.d.ts files)
  5. Run the code above
  6. Resulting in this page-break-test.pdf, scroll down to see the text

@HackbrettXXX
Copy link
Collaborator

@Yabonev you need to run npm run build in the cloned repo first. Otherwise the bundles aren't up to date.

@Yabonev
Copy link

Yabonev commented Sep 10, 2021

@HackbrettXXX Thanks for the help, it works.

Bug sample - sentence is sliced in the middle and written on two pages

Here's the "before" document page-break-test.pdf
If you scroll a few pages you can easily see the issue.

Above issue fixed

Here's the pdf with up to date bundles and autoPaging: 'text' page-break-test-correct.pdf

Margin with text

Here's the pdf with paging and margin page-break-test-margin-bottom.pdf

doc.html(...,
{
  autoPaging: 'text',
  margin: [0, 0, 1, 0], // I am using inches here, replace with what you're using.
}

Margin with images

Also checked bottom margin for images, it works, here's the file - image-margin-bottom.pdf

doc.html(...,
{
  autoPaging: 'text', // also works with `'slice'` and `true`
  margin: [0, 0, 1, 0], 
}

Margin with images and text - an issue here

Here's one with both text and image, for some reason, when an image gets slices, the text on the page, on which the image ends is white as the background, still selectable, you can check it out. text_and_image_margin.pdf
Tried it with smaller images, also has the same behavior, if images are clipped to the new page, text on this page is invisible.

Can someone else check this, don't have time for minimal repro?

@HackbrettXXX
Copy link
Collaborator

@Yabonev thanks for testing it. Unfortunately, I wasn't able to reproduce it by myself. This test case

  it("page break text + image with autoPaging: 'text'", async () => {
    const text = Array.from({ length: 30 })
      .map((_, i) => `ABC${i}`)
      .join(" ");

    const doc = jsPDF({ floatPrecision: 2, unit: "pt" });
    await new Promise(resolve =>
      doc.html(
        `<span>${text}</span>
<img src="" width="10" height="800">
<span>${text}</span>`,
        {
          callback: resolve,
          margin: [10, 30, 10, 30],
          autoPaging: "text"
        }
      )
    );

    const numberOfPages = doc.getNumberOfPages();
    const pageWidth = doc.internal.pageSize.getWidth();
    const pageHeight = doc.internal.pageSize.getHeight();
    for (let i = 1; i <= numberOfPages; i++) {
      doc.setPage(i);
      doc.rect(30, 10, pageWidth - 60, pageHeight - 20);
    }
    comparePdf(
      doc.output(),
      "html-margin-text-image-page-break-text.pdf",
      "html"
    );
  });

produces this output:
html-margin-text-image-page-break-text.pdf

Could you share the markup you used for your test case? I've checked your PDF file, and it seems like there are some clip paths where they shouldn't be or the clip paths aren't properly wrapped with save/restore graphics state commands.

@Yabonev
Copy link

Yabonev commented Sep 10, 2021

@HackbrettXXX I somehow lost the data while testing more. Here's a case with similar input and same output, just the images are more, but smaller.

Here's the image_text_test.txt file.

This data is coming from rich html editor, so the image is base64 encoded string, maybe this creates the problem, but I see in your test case it's the same.

Here's the result once again invisible_text_on_image_clip.pdf

@HackbrettXXX
Copy link
Collaborator

HackbrettXXX commented Sep 10, 2021

@Yabonev thanks. I still can't reproduce it though. When I diff my PDF with yours there are indeed differences in the save/restore graphics state commands, but I don't know why. I think I need a complete repro, otherwise I can only make guesses. Maybe we have different html2canvas versions?

@HackbrettXXX
Copy link
Collaborator

I think I will merge it now and create a release later. If there are still bugs, we can fix them afterwards.

@Yabonev
Copy link

Yabonev commented Sep 14, 2021

@HackbrettXXX Sorry for not responding, been focusing on other things, I also think that's okay. I will create a new issue with minimal reproduction when I have the time.

@Yabonev
Copy link

Yabonev commented Sep 14, 2021

While trying to create minimal repro, I found the issue. The text below is for those whom it may concern.

Initial problem - vertical distance between <p> tags is too big

The problem I was trying to solve is that there was too much vertical distance between my <p> tags when the text was pasted from notepad. The first file is my initial problem. The second file shows the end result after a fix.
text_padding_issue.pdf
text_padding_issue_fixed.pdf

Solution to the initial issue - set margin: 0 to all <p> tags

Using custom CSS class to remove the margin between p tags - p-margin-zero. I am using angular, so I will add it as context.

// styles.scss (global styles file for the app)
.p-margin-zero {
  p {
    margin: 0;
  }
}
const doc = new jsPDF('portrait', 'in', 'a4');
const html = `<div class="p-margin-zero" style="width:697.7066666666666px;"><p style="text-align: right;"><br></p><p><span style="font-size: 36pt;">A</span></p><p><br></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><br></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><br></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><br></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><span style="font-size: 36pt;">AAAAAAAAAAAAAAAAAAA</span></p><p><img src="`;
await doc.html(html, {
  x: 0.6,
  y: 0.9,
  jsPDF: doc,
  autoPaging: 'text',
  margin: [0, 0, 1, 0],
  html2canvas: {
    imageTimeout: 0,
    backgroundColor: 'none',
    scale: 0.01,
  },
});

New problem - content on new pages is invisible

For some reason, HTML tags were disappearing when an <img> tag was sliced from a new page. At least, that's what I thought.

Here's what it looked like - invisible_text_on_image_clip.pdf

The actual issue - margin collapsing

Seems like when you add margin:0 to <p> tags, their margins collapse, whatever that means, which bugs the whole thing. I am not sure if this is an issue with jsPDF, html2canvas or pdf in general but removing this margin collapse fixed my problem with the disappearing text, here's the result when the fix is applied - visible_text_on_image_clip.pdf

How to fix margin collapsing?

You want to "remove margin collapsing" between adjacent elements. Maybe even parent-child element.
I've personally removed the margin collapsing by putting overflow: auto on the parent of all <p> tags.
This was especially useful https://stackoverflow.com/questions/19718634/how-to-disable-margin-collapsing

.p-margin-zero {
  overflow: auto;
  p {
    margin: 0;
  }
}

@HackbrettXXX I will not be creating a new issue for this, but you might want to think if the problem is from jsPDF or somewhere else. Also thanks for the help and for merging this 🥇

@HackbrettXXX
Copy link
Collaborator

@Yabonev thanks for the repro. It is definitively an error in jsPDF and I was able to reproduce it now. I've created an issue for it: #3265.

@Yabonev
Copy link

Yabonev commented Sep 16, 2021

Update on overflow: auto

Overflow:auto fixed the collapsing margins, but it had 2 other side effects.

  1. File size increased by ~0.5 of the original file size - that's okay, I can live with that
  2. Scrolling for the pdf broker in some weird way, if you open the file with chrome you should be able to reproduce it - scroll_issue.pdf

Solution

Instead of overflow: auto I used display:flow-root for the margin-collapsing. My initial problem is fixed(the vertical distance between text), but #3265 stays. Since @HackbrettXXX said that it's with jsPDF, after the fix for 3265 is in, everything should work as expect.

but there may be an issue with overflow:auto and jsPDF.

@HackbrettXXX do you want me to try creating a minimal repro for this?

@HackbrettXXX
Copy link
Collaborator

You're running into #3137 with the scroll_issue PDF: every letter on the second page is written in its own text chunk. This increases the file size enormously and hurts viewing performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

html() does not split content right html: margins have no effect