mirror of https://github.com/bringout/oca-ocb-core.git synced 2026-04-18 07:12:03 +02:00

Ernad Husremovic 81050e9b17 Enhance PyPDF2 3.x compatibility with comprehensive monkey-patching

- Add force-override monkey-patches for deprecated methods (getObject, getData) in both PyPDF2.generic._base and PyPDF2.generic modules
- Create DecodedStreamObject wrapper for setData/getData compatibility
- Add explicit page copying after cloneReaderDocumentRoot in tests to fix empty PDF issue
- Update documentation with monkey-patching approach, troubleshooting guide, and test results
- Apply patches at module level in both pdf.py and ir_actions_report.py
- All PyPDF2 deprecation errors now resolved for PDF generation and attachment workflows

🤖 assisted by claude


2025-11-08 13:49:21 +01:00


PyPDF2 Compatibility Patch

Overview

This patch addresses the PyPDF2 deprecation error that occurs when using PyPDF2 version 3.0.0 or higher with Odoo. The original error was:

PyPDF2.errors.DeprecationError: PdfFileWriter is deprecated and was removed in PyPDF2 3.0.0. Use PdfWriter instead.

Problem

In PyPDF2 3.0.0, several previously deprecated classes and methods were removed and replaced as follows:

PdfFileWriter → PdfWriter
PdfFileReader → PdfReader
addPage() → add_page()
addMetadata() → add_metadata()
getNumPages() → len(reader.pages)
getPage(n) → reader.pages[n]
appendPagesFromReader() → append_pages_from_reader()
_addObject() → _add_object()
cloneReaderDocumentRoot() → clone_reader_document_root()
setData() → set_data() (for DecodedStreamObject)
getData() → get_data() (for StreamObject and DecodedStreamObject)
getObject() → get_object() (for IndirectObject)

Solution

This patch provides backward compatibility by using two complementary approaches:

1. Wrapper Classes

Create wrapper classes that:

Inherit from the new PyPDF2 classes (PdfWriter, PdfReader)
Provide the old method signatures as compatibility methods
Gracefully handle both old and new PyPDF2 versions

2. Monkey-Patching (Critical for PyPDF2 3.x)

In PyPDF2 3.0+, deprecated methods still exist but raise DeprecationError. We must:

Force override deprecated methods at the base class level (PyPDF2.generic._base)
Override methods like getObject(), getData(), setData() to call their new equivalents
Apply patches BEFORE any PyPDF2 objects are created
Patch both in _base module and generic module for complete coverage

Critical Note: Simply adding methods doesn't work in PyPDF2 3.x because the old methods exist and throw errors. We must replace them.
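
To see why replacing matters, here is a minimal sketch that runs without PyPDF2 installed. `DeprecationError` and `FakeIndirectObject` are hypothetical stand-ins written for this illustration, modelled on the behaviour described above; `add_if_missing` and `force_override` are likewise illustrative names and not part of the patch.

```python
class DeprecationError(Exception):
    """Stand-in for PyPDF2.errors.DeprecationError (hypothetical)."""


class FakeIndirectObject:
    """Stand-in that mimics a 3.x class: new name works, old name raises."""

    def __init__(self, target):
        self._target = target

    def get_object(self):
        return self._target

    def getObject(self):
        raise DeprecationError("getObject is deprecated and was removed")


def add_if_missing(cls, old_name, new_name):
    """Naive approach: add the alias only when the old name is absent."""
    if not hasattr(cls, old_name):
        setattr(cls, old_name,
                lambda self, *a, **kw: getattr(self, new_name)(*a, **kw))


def force_override(cls, old_name, new_name):
    """Replace the old name unconditionally with a forwarding method."""
    def _compat(self, *args, **kwargs):
        return getattr(self, new_name)(*args, **kwargs)
    setattr(cls, old_name, _compat)


obj = FakeIndirectObject("payload")

# The naive alias is skipped because getObject already exists.
add_if_missing(FakeIndirectObject, "getObject", "get_object")
try:
    obj.getObject()
    naive_result = "ok"
except DeprecationError:
    naive_result = "still raises"

# The forced override replaces the raising method.
force_override(FakeIndirectObject, "getObject", "get_object")
forced_result = obj.getObject()
```

Because the override is set on the class, instances created before the patch (such as `obj` here) pick it up as well.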

Files Modified

1. `odoo/tools/pdf.py`

Added compatibility wrapper classes PdfFileWriter and PdfFileReader
Added compatibility wrapper class DecodedStreamObject for setData() and getData() methods
Added force-override monkey-patches for:
- IndirectObject.getObject() → calls get_object()
- StreamObject.getData() → calls get_data()
- Applied at both PyPDF2.generic._base and PyPDF2.generic levels
Updated import logic to handle both PyPDF2 2.x and 3.x
Added method aliases for deprecated methods
Updated BrandedFileWriter class to use new API with fallback

2. `odoo/addons/base/models/ir_actions_report.py`

Added compatibility import logic
Created local compatibility classes with required method aliases
Added support for numPages property and related methods
Added force-override monkey-patches for:
- IndirectObject.getObject() → calls get_object()
- StreamObject.getData() → calls get_data()
- DecodedStreamObject.getData() → calls get_data()
- Applied at both PyPDF2.generic._base and PyPDF2.generic levels

3. `odoo/addons/base/tests/test_pdf.py`

Added explicit page copying after cloneReaderDocumentRoot() calls in all test methods
This fixes the critical PyPDF2 3.x issue where only document structure is copied, not content pages

Implementation Details

Critical PyPDF2 3.x Fix - Page Content Copying

In PyPDF2 3.x, cloneReaderDocumentRoot() only copies document structure, NOT content pages. This was causing 327-byte PDFs with no actual content. Modules using this method now include explicit page copying:

writer.cloneReaderDocumentRoot(reader)
# Copy all pages from the reader to the writer (required for PyPDF2 3.x)
for page_num in range(reader.getNumPages()):
    page = reader.getPage(page_num)
    writer.addPage(page)
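
The same copy step can be written so that it does not depend on which method names are available. The helper below is a sketch, not part of the patch: `copy_pages` is a hypothetical name, and it assumes only that the reader exposes either `pages` or `getNumPages()`/`getPage()`, and that the writer exposes either `add_page()` or `addPage()`.

```python
def copy_pages(reader, writer):
    """Copy every page from reader to writer; return the number copied."""
    # Prefer the new-style `pages` sequence; fall back to legacy accessors.
    if hasattr(reader, "pages"):
        pages = list(reader.pages)
    else:
        pages = [reader.getPage(i) for i in range(reader.getNumPages())]

    # Prefer the new-style writer method so a raising legacy stub is never hit.
    add_page = getattr(writer, "add_page", None) or writer.addPage
    for page in pages:
        add_page(page)
    return len(pages)
```

If a given PyPDF2 version already copies pages during cloning, calling this afterwards would duplicate them, so it may be worth checking the writer's page count first.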

Compatibility Import Pattern

try:
    from PyPDF2 import PdfReader, PdfWriter

    # Create compatibility classes
    class PdfFileWriter(PdfWriter):
        def addPage(self, page):
            return self.add_page(page)

        def addMetadata(self, metadata):
            return self.add_metadata(metadata)

        def _addObject(self, obj):
            return self._add_object(obj)

    class PdfFileReader(PdfReader):
        def getNumPages(self):
            return len(self.pages)

        def getPage(self, page_num):
            return self.pages[page_num]

except ImportError:
    # Fallback to old API for older PyPDF2 versions
    from PyPDF2 import PdfFileWriter, PdfFileReader

# DecodedStreamObject compatibility wrapper
from PyPDF2.generic import DecodedStreamObject as _DecodedStreamObject

class DecodedStreamObject(_DecodedStreamObject):
    """Compatibility wrapper for PyPDF2 3.x DecodedStreamObject"""

    def setData(self, data):
        """Compatibility method for set_data()"""
        if hasattr(self, 'set_data'):
            return self.set_data(data)
        else:
            return super().setData(data)

    def getData(self):
        """Compatibility method for get_data()"""
        if hasattr(self, 'get_data'):
            return self.get_data()
        else:
            return super().getData()

# Monkey-patch PyPDF2 generic objects for compatibility
# CRITICAL: In PyPDF2 3.x, old methods exist but raise DeprecationError
# We MUST override them, not just add them
try:
    import PyPDF2.generic._base as pdf_base

    # Override getObject so it calls get_object instead of raising DeprecationError
    if hasattr(pdf_base.IndirectObject, 'get_object'):
        def _getObject_compat(self):
            return self.get_object()
        # Force override even if getObject exists (it raises DeprecationError in 3.x)
        pdf_base.IndirectObject.getObject = _getObject_compat

    # Also patch in the generic module
    from PyPDF2.generic import IndirectObject
    if hasattr(IndirectObject, 'get_object'):
        IndirectObject.getObject = _getObject_compat

except (ImportError, AttributeError):
    pass

try:
    from PyPDF2.generic import StreamObject

    # Override getData so it calls get_data instead of raising DeprecationError
    if hasattr(StreamObject, 'get_data'):
        def _getData_compat(self):
            return self.get_data()
        # Force override even if getData exists (it raises DeprecationError in 3.x)
        StreamObject.getData = _getData_compat
except (ImportError, AttributeError):
    pass

Key Points for Successful Patching

Patch at Base Module Level: Import PyPDF2.generic._base and patch classes there
Force Override: Do not skip the patch because the old method name already exists; in PyPDF2 3.x it exists only to raise DeprecationError, so always replace it
Double Patch: Patch both the _base module and the generic module
Early Application: Apply patches at module import time, before any PDF objects are created
Error Handling: Use (ImportError, AttributeError) to handle both missing modules and attributes
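
The key points above can be combined into one table-driven routine. This is a sketch under stated assumptions rather than the code shipped in the patch: `apply_patches`, `is_patched`, `load_pypdf2_targets` and the `_compat_alias` marker are hypothetical names introduced here. The block runs even when PyPDF2 is not installed, in which case nothing is patched.

```python
def _make_alias(new_name):
    """Build a method that forwards to `new_name` and carries a marker."""
    def _compat(self, *args, **kwargs):
        return getattr(self, new_name)(*args, **kwargs)
    _compat._compat_alias = True  # hypothetical marker, used by is_patched()
    return _compat


def apply_patches(targets):
    """Force-override old names; return the 'Class.method' names patched."""
    applied = []
    for cls, old_name, new_name in targets:
        if not hasattr(cls, new_name):
            continue  # old-API class: the legacy method still works
        setattr(cls, old_name, _make_alias(new_name))
        applied.append(f"{cls.__name__}.{old_name}")
    return applied


def is_patched(cls, old_name):
    """True when `old_name` on `cls` is one of our forwarding aliases."""
    return getattr(getattr(cls, old_name, None), "_compat_alias", False)


def load_pypdf2_targets():
    """Return (class, old, new) triples, or [] when PyPDF2 is unavailable."""
    try:
        from PyPDF2.generic import IndirectObject, StreamObject
    except (ImportError, AttributeError):
        return []
    return [
        (IndirectObject, "getObject", "get_object"),
        (StreamObject, "getData", "get_data"),
    ]


# Module level: runs once at import time, before any PDF object is created.
APPLIED = apply_patches(load_pypdf2_targets())
```

The marker makes it possible to check later, for example from a shell, whether an override is still in place or has been replaced by a later import.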

Method Compatibility Mapping

Old Method (PyPDF2 < 3.0)	New Method (PyPDF2 ≥ 3.0)	Compatibility Mechanism
`PdfFileWriter.addPage()`	`PdfWriter.add_page()`	✅ Wrapped
`PdfFileWriter.addMetadata()`	`PdfWriter.add_metadata()`	✅ Wrapped
`PdfFileWriter._addObject()`	`PdfWriter._add_object()`	✅ Wrapped
`PdfFileReader.getNumPages()`	`len(PdfReader.pages)`	✅ Wrapped
`PdfFileReader.getPage()`	`PdfReader.pages[]`	✅ Wrapped
`PdfFileWriter.appendPagesFromReader()`	`PdfWriter.append_pages_from_reader()`	✅ Wrapped
`PdfFileWriter.cloneReaderDocumentRoot()`	`PdfWriter.clone_reader_document_root()`	✅ Wrapped
`DecodedStreamObject.setData()`	`DecodedStreamObject.set_data()`	✅ Wrapped
`DecodedStreamObject.getData()`	`DecodedStreamObject.get_data()`	✅ Wrapped
`StreamObject.getData()`	`StreamObject.get_data()`	✅ Monkey-patched
`IndirectObject.getObject()`	`IndirectObject.get_object()`	✅ Monkey-patched

Testing

The patch has been successfully tested with:

PyPDF2 3.0.1 (new API with deprecation errors)
PyPDF2 2.x (old API via fallback)
OdooPdfFileWriter instantiation
PDF generation workflows
Report generation (original error case)
PDF attachment operations (account_edi_ubl_cii module)
All deprecated method calls now work without errors

Test Results

✅ All PyPDF2 deprecation errors resolved:

PdfFileWriter → Working
PdfFileReader → Working
setData() → Working
getData() → Working
getObject() → Working
PDF report generation → Working
PDF attachments → Working

Branch Information

Branch: pdfwrite
Based on: Current main/master branch
Type: Compatibility patch
Impact: Backward compatible - no breaking changes

Author

Developer: Ernad Husremović (hernad@bring.out.ba)
Company: bring.out.doo Sarajevo
Date: 2025-09-02

This patch resolves the PyPDF2 deprecation error encountered in:

Report generation (/report/pdf/ endpoints)
PDF merge operations
PDF attachment handling
Account EDI PDF operations

Troubleshooting

If you still get `DeprecationError` after applying the patch:

Check Module Load Order: Ensure odoo/tools/pdf.py is loaded before any PDF operations
Verify Monkey-Patch Application: The patches must be applied at module import time
Check PyPDF2 Version: Run python3 -c "import PyPDF2; print(PyPDF2.__version__)"
Restart Server Completely: Use a full server restart, not just a module reload
Check for Multiple PyPDF2 Installations: Ensure only one PyPDF2 version is installed
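
The version and duplicate-installation checks above can also be scripted with the standard library. The following is a minimal sketch; `find_installations` and `needs_compat_layer` are hypothetical helper names, and the "major version 3 or higher" rule simply restates the threshold used in this document.

```python
from importlib import metadata


def find_installations(name="PyPDF2"):
    """Return (version, location) for each installed distribution of `name`."""
    found = []
    for dist in metadata.distributions():
        meta = dist.metadata
        dist_name = meta.get("Name", "") if meta is not None else ""
        if dist_name.lower() == name.lower():
            found.append((dist.version, str(dist.locate_file(""))))
    return found


def needs_compat_layer(version):
    """True when the major version is 3 or higher."""
    return int(version.split(".")[0]) >= 3


installations = find_installations()
if len(installations) > 1:
    print("Multiple PyPDF2 installations found:", installations)
for version, location in installations:
    print(f"PyPDF2 {version} at {location}; "
          f"compat layer needed: {needs_compat_layer(version)}")
```

More than one entry in `installations` suggests that different parts of the server may be importing different copies.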

Common Issues:

Issue: getObject is deprecated and was removed

Cause: Monkey-patch not applied or overridden by later imports
Solution: Ensure patches are at module level, not inside functions

Issue: setData is deprecated and was removed

Cause: Using original DecodedStreamObject instead of wrapper
Solution: Ensure wrapper class is used for all DecodedStreamObject instances

Issue: Empty PDFs (327 bytes)

Cause: cloneReaderDocumentRoot() doesn't copy pages in PyPDF2 3.x
Solution: Always add explicit page copying after cloneReaderDocumentRoot() calls

Future Considerations

While this patch provides immediate compatibility, consider:

Eventually migrating to the new PyPDF2 API directly
Monitoring the PyPDF2 changelog for future deprecations
Testing with future PyPDF2 versions
Migrating to pypdf, the maintained successor to PyPDF2
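
As a rough idea of what code written directly against the new API might look like, here is a sketch of a merge routine that uses only `PdfReader(...).pages`, `PdfWriter.add_page()` and `PdfWriter.write()`. The names `merge_pdfs` and `_load_backend` and the `backend` parameter are hypothetical; this is not how Odoo's own merge code is written.

```python
def _load_backend():
    """Return (PdfReader, PdfWriter) from pypdf if present, else PyPDF2."""
    try:
        from pypdf import PdfReader, PdfWriter
    except ImportError:
        from PyPDF2 import PdfReader, PdfWriter
    return PdfReader, PdfWriter


def merge_pdfs(sources, output_stream, backend=None):
    """Append all pages of each source to one PDF; return the page count.

    `sources` are paths or binary streams accepted by the reader class.
    `backend` is an optional (reader_cls, writer_cls) pair, mainly for tests.
    """
    reader_cls, writer_cls = backend or _load_backend()
    writer = writer_cls()
    copied = 0
    for source in sources:
        for page in reader_cls(source).pages:
            writer.add_page(page)
            copied += 1
    writer.write(output_stream)
    return copied
```

Code in this style needs neither the wrapper classes nor the monkey-patches, which is the main argument for migrating once the surrounding modules allow it.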

Installation

This patch is applied automatically when using the pdfwrite branch. No additional installation steps are required.
