ENH: Add Cloning #1371

Merged
merged 116 commits on Dec 11, 2022
Changes from 111 commits
7d2a74b
Add Cloning capability
pubpub-zz Sep 26, 2022
661b6bf
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Sep 27, 2022
c6ac1e2
exclude_fields can be propagated
pubpub-zz Sep 27, 2022
f9d7d19
BUG : write reuse
pubpub-zz Sep 27, 2022
2c78419
cloning, part2
pubpub-zz Oct 5, 2022
54abc77
cloning part3
pubpub-zz Oct 9, 2022
0506ae4
Fix flake8+ "/Count"
pubpub-zz Oct 9, 2022
bd0c855
flake8
pubpub-zz Oct 9, 2022
ffc8e53
Flake 8
pubpub-zz Oct 9, 2022
a66bcc2
Sort DestNames + add page cleanup for annots
pubpub-zz Oct 11, 2022
90c95b7
flake8
pubpub-zz Oct 11, 2022
52e8bcd
mypy 1/n
pubpub-zz Oct 12, 2022
1e55376
add test for iis #471
pubpub-zz Oct 12, 2022
506f35e
flake8
pubpub-zz Oct 13, 2022
2abe7e9
mypy
pubpub-zz Oct 13, 2022
6ee6859
mypy
pubpub-zz Oct 14, 2022
39e4f9f
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Oct 14, 2022
9bdde0f
B006 fix 1
pubpub-zz Oct 15, 2022
803becb
B006 fix 2
pubpub-zz Oct 15, 2022
1727985
mypy
pubpub-zz Oct 15, 2022
f498373
mypy
pubpub-zz Oct 15, 2022
1c60786
Martin's recommendation
pubpub-zz Oct 16, 2022
198ada8
Martin's recommendation
pubpub-zz Oct 16, 2022
f0fdd4a
Update PyPDF2/generic/_data_structures.py
pubpub-zz Oct 16, 2022
e56555d
Martin's suggestion
pubpub-zz Oct 16, 2022
b3f33f1
Martin's suggestion
pubpub-zz Oct 16, 2022
6d6094c
Martin's suggestion
pubpub-zz Oct 16, 2022
1793218
Martin's suggestion
pubpub-zz Oct 16, 2022
83a3fea
Martin's suggestion
pubpub-zz Oct 16, 2022
4e3478b
Martin's suggestion
pubpub-zz Oct 16, 2022
9b756a9
Martin's suggestion
pubpub-zz Oct 16, 2022
a9449a6
fix xylopaint
pubpub-zz Oct 16, 2022
e32f3de
doc
pubpub-zz Oct 16, 2022
73fe215
add Annotation cloning
pubpub-zz Oct 17, 2022
ee1a333
mypy
pubpub-zz Oct 17, 2022
1024177
flake8
pubpub-zz Oct 17, 2022
0994df0
add tests in test_reader
pubpub-zz Oct 18, 2022
e6c4745
add tests in test_generic for _base
pubpub-zz Oct 18, 2022
da2fe09
add tests in test_generic for _data_structure
pubpub-zz Oct 19, 2022
9c06495
clone articles
pubpub-zz Oct 21, 2022
dd56dc1
flake8
pubpub-zz Oct 21, 2022
1aa8d48
mypy
pubpub-zz Oct 21, 2022
9dde5c1
mypy
pubpub-zz Oct 21, 2022
4edaa08
clean up
pubpub-zz Oct 29, 2022
5daa82a
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Oct 29, 2022
d6efb16
create PdfWriterInterface to prevent recursive import
pubpub-zz Oct 29, 2022
e80d602
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Oct 29, 2022
223eb92
indirect_ref annotation
pubpub-zz Oct 30, 2022
505b32a
clarify annotation
pubpub-zz Oct 30, 2022
592946f
fix test_outline_missing_title
pubpub-zz Oct 30, 2022
edcd132
Merge branch 'main' into cloning
pubpub-zz Oct 30, 2022
5b85816
flake8
pubpub-zz Oct 30, 2022
0aef276
/Annots and /B exclusions +
pubpub-zz Nov 6, 2022
5260fa8
add reset_translation
pubpub-zz Nov 6, 2022
14f5d86
doc
pubpub-zz Nov 6, 2022
87eb2f2
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Nov 6, 2022
1eec0d6
flake8
pubpub-zz Nov 6, 2022
13b7a8a
mypy
pubpub-zz Nov 6, 2022
0c70ab3
add test for reset_translation and Annots and articles rejection
pubpub-zz Nov 6, 2022
064b461
test improved
pubpub-zz Nov 6, 2022
c5169dc
allow multiple insertions of same source page
pubpub-zz Nov 9, 2022
aad410b
flake8
pubpub-zz Nov 9, 2022
5eca56f
mypy
pubpub-zz Nov 10, 2022
deb4ec9
mypy 2
pubpub-zz Nov 10, 2022
e1c3ed3
Rewriting using Protocols
pubpub-zz Nov 12, 2022
863d140
flake8
pubpub-zz Nov 12, 2022
87ff49b
update iaw comments
pubpub-zz Nov 27, 2022
3029536
Merge from Main
pubpub-zz Nov 27, 2022
a05b7ed
report test added
pubpub-zz Nov 28, 2022
2a26772
flake8
pubpub-zz Nov 28, 2022
3f21862
fix test
pubpub-zz Nov 28, 2022
e8b4929
line reintroduced
pubpub-zz Nov 28, 2022
4ccfbff
Merge remote-tracking branch 'py-pdf/main' into cloning
pubpub-zz Dec 2, 2022
ef799af
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
6b03b34
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
0b86b4f
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
73326a4
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
bacaef0
Apply suggestions from code review
MartinThoma Dec 10, 2022
e0cda7b
Update PyPDF2/generic/_base.py
MartinThoma Dec 10, 2022
172ac7b
Apply suggestions from code review
MartinThoma Dec 10, 2022
c34daa2
Apply suggestions from code review
MartinThoma Dec 10, 2022
d6ac98f
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
42fe44f
Apply suggestions from code review
MartinThoma Dec 10, 2022
e107b37
Apply suggestions from code review
MartinThoma Dec 10, 2022
fdd6742
Merge branch 'main' into cloning
MartinThoma Dec 10, 2022
1f3ce70
Apply suggestions from code review
MartinThoma Dec 10, 2022
d667d18
Apply suggestions from code review
MartinThoma Dec 10, 2022
6952ae2
Apply suggestions from code review
MartinThoma Dec 10, 2022
5be40fe
Apply suggestions from code review
MartinThoma Dec 10, 2022
4367c35
Apply suggestions from code review
MartinThoma Dec 10, 2022
dad3a33
Apply suggestions from code review
MartinThoma Dec 10, 2022
8a90794
Update PyPDF2/generic/_base.py
MartinThoma Dec 10, 2022
bfb19ff
Apply suggestions from code review
MartinThoma Dec 10, 2022
f6a1208
Apply suggestions from code review
MartinThoma Dec 10, 2022
e7970c5
Apply suggestions from code review
MartinThoma Dec 10, 2022
499c217
Apply suggestions from code review
MartinThoma Dec 10, 2022
f66df12
Apply suggestions from code review
MartinThoma Dec 10, 2022
239ce02
Apply suggestions from code review
MartinThoma Dec 10, 2022
7dd34e1
Apply suggestions from code review
MartinThoma Dec 10, 2022
71dd89d
Apply suggestions from code review
MartinThoma Dec 10, 2022
2dcfe27
Apply suggestions from code review
MartinThoma Dec 10, 2022
f4b8d00
Apply suggestions from code review
MartinThoma Dec 10, 2022
396ba11
Apply suggestions from code review
MartinThoma Dec 10, 2022
78c1731
Update PyPDF2/_writer.py
MartinThoma Dec 10, 2022
2896a4c
Apply suggestions from code review
MartinThoma Dec 10, 2022
82c6f56
Update PyPDF2/_writer.py
MartinThoma Dec 10, 2022
c579403
Apply suggestions from code review
MartinThoma Dec 10, 2022
89761aa
Update PyPDF2/_writer.py
MartinThoma Dec 10, 2022
25d6a7e
Apply suggestions from code review
MartinThoma Dec 11, 2022
bb5022a
Update PyPDF2/_writer.py
MartinThoma Dec 11, 2022
dfcef5b
Update PyPDF2/_writer.py
MartinThoma Dec 11, 2022
a1dd2a9
Apply suggestions from code review
MartinThoma Dec 11, 2022
222ef91
Update PyPDF2/_writer.py
MartinThoma Dec 11, 2022
bf590a0
Merge branch 'main' into cloning
MartinThoma Dec 11, 2022
5513fa1
Apply suggestions from code review
MartinThoma Dec 11, 2022
afebcab
Merge branch 'main' into cloning
MartinThoma Dec 11, 2022
1 change: 1 addition & 0 deletions PyPDF2/_merger.py
@@ -702,6 +702,7 @@ def add_outline_item(
             title,
             page_number,
             parent,
+            None,
             color,
             bold,
             italic,
8 changes: 5 additions & 3 deletions PyPDF2/_page.py
@@ -46,6 +46,7 @@
 )

 from ._cmap import build_char_map, unknown_char_map
+from ._protocols import PdfReaderProtocol
 from ._utils import (
     CompressedTransformationMatrix,
     File,
@@ -288,16 +289,17 @@ class PageObject(DictionaryObject):
         this object in its source PDF
     """

+    original_page: "PageObject"  # very local use in writer when appending
+
     def __init__(
         self,
-        pdf: Optional[Any] = None,  # PdfReader
+        pdf: Optional[PdfReaderProtocol] = None,
         indirect_reference: Optional[IndirectObject] = None,
         indirect_ref: Optional[IndirectObject] = None,
     ) -> None:
-        from ._reader import PdfReader

         DictionaryObject.__init__(self)
-        self.pdf: Optional[PdfReader] = pdf
+        self.pdf: Optional[PdfReaderProtocol] = pdf
         if indirect_ref is not None:  # deprecated
             warnings.warn(
                 "Use indirect_reference instead of indirect_ref.", DeprecationWarning
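The hunk above is cut off right after the `warnings.warn(` call, so what PyPDF2 does with the deprecated `indirect_ref` value is not visible here. As a rough illustration of the general pattern (a deprecated keyword kept as an alias for its replacement), here is a minimal sketch. The `Page` class, the conflict check, and the fallback assignment are assumptions made for this example and are not taken from the PR.

```python
import warnings
from typing import Any, Optional


class Page:
    """Hypothetical stand-in for PageObject; not PyPDF2 code."""

    def __init__(
        self,
        indirect_reference: Optional[Any] = None,
        indirect_ref: Optional[Any] = None,  # deprecated alias
    ) -> None:
        if indirect_ref is not None:
            # Same message as in the diff; stacklevel=2 points at the caller.
            warnings.warn(
                "Use indirect_reference instead of indirect_ref.",
                DeprecationWarning,
                stacklevel=2,
            )
            # Assumption: reject ambiguous calls, then fall back to the old name.
            if indirect_reference is not None:
                raise ValueError("Pass only one of indirect_reference / indirect_ref.")
            indirect_reference = indirect_ref
        self.indirect_reference = indirect_reference
```

Callers using the old keyword keep working but see a `DeprecationWarning`, which gives them a release cycle to migrate.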
65 changes: 65 additions & 0 deletions PyPDF2/_protocols.py
@@ -0,0 +1,65 @@
+"""Helpers for working with PDF types."""
+
+from io import BufferedReader, BufferedWriter, BytesIO, FileIO
+from pathlib import Path
+from typing import Any, Dict, List, Optional, Tuple, Union
+
+try:
+    # Python 3.8+: https://peps.python.org/pep-0586
+    from typing import Protocol  # type: ignore[attr-defined]
+except ImportError:
+    from typing_extensions import Protocol  # type: ignore[misc]
+
+from ._utils import StrByteType
+
+
+class PdfObjectProtocol(Protocol):
+    indirect_reference: Any
+
+    def clone(
+        self,
+        pdf_dest: Any,
+        force_duplicate: bool = False,
+        ignore_fields: Union[Tuple[str, ...], List[str], None] = (),
+    ) -> Any:
+        ...
+
+    def _reference_clone(self, clone: Any, pdf_dest: Any) -> Any:
+        ...
+
+    def get_object(self) -> Optional["PdfObjectProtocol"]:
+        ...
+
+
+class PdfReaderProtocol(Protocol):  # pragma: no cover
+    @property
+    def pdf_header(self) -> str:
+        ...
+
+    @property
+    def strict(self) -> bool:
+        ...
+
+    @property
+    def xref(self) -> Dict[int, Dict[int, Any]]:
+        ...
+
+    @property
+    def pages(self) -> List[Any]:
+        ...
+
+    def get_object(self, indirect_reference: Any) -> Optional[PdfObjectProtocol]:
+        ...
+
+
+class PdfWriterProtocol(Protocol):  # pragma: no cover
+    _objects: List[Any]
+    _id_translated: Dict[int, Dict[int, int]]
+
+    def get_object(self, indirect_reference: Any) -> Optional[PdfObjectProtocol]:
+        ...
+
+    def write(
+        self, stream: Union[Path, StrByteType]
+    ) -> Tuple[bool, Union[FileIO, BytesIO, BufferedReader, BufferedWriter]]:
+        ...
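The commit messages mention creating an interface "to prevent recursive import" and then "Rewriting using Protocols". A small sketch of why a `typing.Protocol` helps with that: a function can be annotated against the protocol and still accept any class with matching members, without importing that class. `ReaderLike`, `FakeReader`, and `resolve` below are hypothetical names for this illustration and are not part of PyPDF2. The sketch assumes Python 3.8+.

```python
from typing import Any, Dict, Optional, Protocol


class ReaderLike(Protocol):
    """Hypothetical, trimmed-down analogue of a reader protocol."""

    @property
    def strict(self) -> bool:
        ...

    def get_object(self, indirect_reference: Any) -> Optional[Any]:
        ...


class FakeReader:
    """Never inherits from ReaderLike; it matches by structure alone."""

    def __init__(self, objects: Dict[int, Any], strict: bool = False) -> None:
        self._objects = objects
        self._strict = strict

    @property
    def strict(self) -> bool:
        return self._strict

    def get_object(self, indirect_reference: Any) -> Optional[Any]:
        return self._objects.get(indirect_reference)


def resolve(reader: ReaderLike, ref: int) -> Optional[Any]:
    """Look up ref; in strict mode a missing object is an error."""
    obj = reader.get_object(ref)
    if obj is None and reader.strict:
        raise KeyError(f"object {ref} not found")
    return obj
```

Because `resolve` depends only on `ReaderLike`, the module that defines it does not need to import the concrete reader module, which is how this style can break an import cycle.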
3 changes: 3 additions & 0 deletions PyPDF2/_reader.py
@@ -972,6 +972,7 @@ def _build_outline_item(self, node: DictionaryObject) -> Optional[Destination]:
                 # absolute value = num. visible children
                 # positive = open/unfolded, negative = closed/folded
                 outline_item[NameObject("/Count")] = node["/Count"]
+        outline_item.node = node
         return outline_item

     @property
@@ -1389,6 +1390,8 @@ def cache_indirect_object(
                 raise PdfReadError(msg)
             logger_warning(msg, __name__)
         self.resolved_objects[(generation, idnum)] = obj
+        if obj is not None:
+            obj.indirect_reference = IndirectObject(idnum, generation, self)
         return obj

     def cacheIndirectObject(
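The second `_reader.py` hunk makes the cache stamp each stored object with a reference back to its own (idnum, generation) slot, so that a later step can ask an object where it came from. The following is a minimal sketch of that idea with hypothetical `Ref`, `Node`, and `Cache` classes. The overwrite message and the `ValueError` are assumptions made for the example, since the corresponding PyPDF2 lines are only partly visible in the hunk.

```python
import logging
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for an indirect reference."""

    idnum: int
    generation: int
    owner: Any


class Node:
    """Hypothetical stand-in for a PDF object."""

    indirect_reference: Optional[Ref] = None


class Cache:
    def __init__(self, strict: bool = False) -> None:
        self.strict = strict
        self.resolved: Dict[Tuple[int, int], Optional[Node]] = {}

    def cache_indirect_object(
        self, generation: int, idnum: int, obj: Optional[Node]
    ) -> Optional[Node]:
        key = (generation, idnum)
        if key in self.resolved:
            msg = f"cache slot {key} is being overwritten"  # assumed wording
            if self.strict:
                raise ValueError(msg)
            logger.warning(msg)
        self.resolved[key] = obj
        if obj is not None:
            # Back-reference: the object now knows its own slot and owner.
            obj.indirect_reference = Ref(idnum, generation, self)
        return obj
```

The `None` check mirrors the diff: a missing object is still cached, but there is nothing to attach a reference to.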