Editing after capture does not remove files referenced only by deleted elements #351

Open
iiv3 opened this issue Aug 15, 2023 · 3 comments
Labels
enhancement New feature or request someday Not planned to do in the near future, mostly due to limited resources

Comments

iiv3 commented Aug 15, 2023

1. Capture a page with images.
2. Open the captured page from the scrapbook sidebar.
3. Use "DOM Eraser" from the WSB toolbar to remove (some of) the images.
4. Use "Save" from the WSB toolbar.

Even though "index.html" no longer references the images, the image files remain in place. It happens with both "folder" and "htz" items when using the backend server.

If editing is done before capture, the erased image files are never stored.

However, it is not always possible to remove all unwanted elements on a live website whose JS code constantly loads more and more content.


The original Scrapbook used to remove files that were no longer linked/used, so I kind of expected the same behavior.

You might consider this feature request.

If my observations are correct, on "Save" the changes are made in place, so leaving old files behind is just a side effect.

I would consider it an improvement if "Save" created a new archive and moved the old one to the "recycle bin". (It might be a good idea to do that on recapture too.)

Owner

danny0838 commented Aug 15, 2023

No, legacy ScrapBook doesn't support this feature. Actually there has been a similar request.

It's not easy to scan for all unused resources in an item, especially when taking into account JavaScript-related content, CSS images, shadow roots, erased content (which is revertible unless explicitly deleted), etc. We probably cannot implement it in the near future.
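
For illustration only, a naive scan could look like the Python sketch below. It assumes an item is a plain folder with an `index.html` and that resources are referenced only through relative `src`/`href` attributes; `RefCollector` and `find_unreferenced` are hypothetical names, not part of WebScrapBook. Anything referenced only from CSS `url()`, scripts, shadow roots, or revertible erased content would be wrongly reported as unused, which is the difficulty described above.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class RefCollector(HTMLParser):
    """Collect relative paths found in src/href attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.refs = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                parts = urlsplit(value)
                # Skip absolute URLs (http:, data:, ...) and pure fragments.
                if parts.scheme or parts.netloc or not parts.path:
                    continue
                self.refs.add(os.path.normpath(unquote(parts.path)))


def find_unreferenced(item_dir, entry="index.html"):
    """Return files under item_dir that entry does not reference directly.

    Assumption: only src/href attributes in the entry page are inspected.
    """
    collector = RefCollector()
    with open(os.path.join(item_dir, entry), encoding="utf-8") as fh:
        collector.feed(fh.read())
    unused = []
    for root, _dirs, files in os.walk(item_dir):
        for name in files:
            rel = os.path.normpath(os.path.relpath(os.path.join(root, name), item_dir))
            if rel != entry and rel not in collector.refs:
                unused.append(rel)
    return sorted(unused)
```

A result from such a scan would only be a list of candidates for deletion; it is not safe to delete them without handling the cases the sketch ignores.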

danny0838 added the enhancement and someday labels Aug 15, 2023
Author

iiv3 commented Aug 15, 2023

You don't need to scan for removed content.

You just make a new capture and add only the content that is referenced. You just skip all the conversion scripts and preserve the original metadata.
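
A rough sketch of that idea in Python, under several assumptions: the item is a plain folder, references are relative `src`/`href` attributes in `index.html`, and the "recycle bin" is just another directory. `rebuild_item` and `_Refs` are hypothetical names and do not reflect how WebScrapBook or its backend actually store items. Because `index.html` is copied unchanged, whatever metadata it carries is kept; metadata stored elsewhere is outside this sketch.

```python
import os
import shutil
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class _Refs(HTMLParser):
    """Collect relative src/href paths (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                parts = urlsplit(value)
                if not parts.scheme and not parts.netloc and parts.path:
                    self.paths.add(os.path.normpath(unquote(parts.path)))


def rebuild_item(item_dir, trash_dir, entry="index.html"):
    """Rebuild item_dir from entry plus the files it references.

    The old folder is moved into trash_dir instead of being deleted.
    Returns the new location of the old folder.
    """
    item_dir = os.path.abspath(item_dir)
    refs = _Refs()
    with open(os.path.join(item_dir, entry), encoding="utf-8") as fh:
        refs.feed(fh.read())

    new_dir = item_dir + ".new"
    os.makedirs(new_dir)
    for rel in [entry, *sorted(refs.paths)]:
        src = os.path.join(item_dir, rel)
        # Ignore references that escape the item folder or are missing.
        if rel.startswith("..") or os.path.isabs(rel) or not os.path.isfile(src):
            continue
        dst = os.path.join(new_dir, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)

    os.makedirs(trash_dir, exist_ok=True)
    old_dir = shutil.move(item_dir, os.path.join(trash_dir, os.path.basename(item_dir)))
    os.rename(new_dir, item_dir)
    return old_dir
```

The sketch has the same blind spots as any attribute-only scan: files referenced from CSS or scripts would be left behind in the trash copy, so the old folder is kept rather than deleted.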

It's possible to do a new "capture page as..." of the captured and edited page, but you lose e.g. the "Source" URL field. No idea what happens with the html, but it also contains some metadata.


BTW, I wasn't talking about "Scrapbook X".
I was talking about the original "Scrapbook". It definitely removed unused files.

Owner

danny0838 commented Aug 16, 2023

> You don't need to scan for removed content.
>
> You just make a new capture and add only the content that is referenced. You just skip all the conversion scripts and preserve the original metadata.

This won't be any easier. Even "preserve metadata" and "replace all original item files" would require a large amount of code work.

You can try implementing it to see if that's true.

> It's possible to do a new "capture page as..." of the captured and edited page, but you lose e.g. the "Source" URL field. No idea what happens with the html, but it also contains some metadata.
>
> BTW, I wasn't talking about "Scrapbook X". I was talking about the original "Scrapbook". It definitely removed unused files.

ScrapBook X is derived from ScrapBook and we've been porting new features from ScrapBook. We are quite sure that ScrapBook doesn't have such a feature. If you think I'm wrong, provide the exact ScrapBook version for further investigation.
