# PyPDF2 Compatibility Patch ## Overview This patch addresses the PyPDF2 deprecation error that occurs when using PyPDF2 version 3.0.0 or higher with Odoo. The original error was: ``` PyPDF2.errors.DeprecationError: PdfFileWriter is deprecated and was removed in PyPDF2 3.0.0. Use PdfWriter instead. ``` ## Problem In PyPDF2 3.0.0, several classes and methods were deprecated and removed: - `PdfFileWriter` → `PdfWriter` - `PdfFileReader` → `PdfReader` - `addPage()` → `add_page()` - `addMetadata()` → `add_metadata()` - `getNumPages()` → `len(pages)` - `getPage(n)` → `pages[n]` - `appendPagesFromReader()` → `append_pages_from_reader()` - `_addObject()` → `_add_object()` - `cloneReaderDocumentRoot()` → `clone_reader_document_root()` - `setData()` → `set_data()` (for `DecodedStreamObject`) - `getData()` → `get_data()` (for `StreamObject` and `DecodedStreamObject`) - `getObject()` → `get_object()` (for `IndirectObject`) ## Solution This patch provides backward compatibility by using two complementary approaches: ### 1. Wrapper Classes Create wrapper classes that: - Inherit from the new PyPDF2 classes (`PdfWriter`, `PdfReader`) - Provide the old method signatures as compatibility methods - Gracefully handle both old and new PyPDF2 versions ### 2. Monkey-Patching (Critical for PyPDF2 3.x) In PyPDF2 3.0+, deprecated methods still exist but raise `DeprecationError`. We must: - **Force override** deprecated methods at the base class level (`PyPDF2.generic._base`) - Override methods like `getObject()`, `getData()`, `setData()` to call their new equivalents - Apply patches BEFORE any PyPDF2 objects are created - Patch both in `_base` module and `generic` module for complete coverage **Critical Note**: Simply adding methods doesn't work in PyPDF2 3.x because the old methods exist and throw errors. We must **replace** them. ## Files Modified ### 1. `odoo/tools/pdf.py` - Added compatibility wrapper classes `PdfFileWriter` and `PdfFileReader` - Added compatibility wrapper class `DecodedStreamObject` for `setData()` and `getData()` methods - **Added force-override monkey-patches for:** - `IndirectObject.getObject()` → calls `get_object()` - `StreamObject.getData()` → calls `get_data()` - Applied at both `PyPDF2.generic._base` and `PyPDF2.generic` levels - Updated import logic to handle both PyPDF2 2.x and 3.x - Added method aliases for deprecated methods - Updated `BrandedFileWriter` class to use new API with fallback ### 2. `odoo/addons/base/models/ir_actions_report.py` - Added compatibility import logic - Created local compatibility classes with required method aliases - Added support for `numPages` property and related methods - **Added force-override monkey-patches for:** - `IndirectObject.getObject()` → calls `get_object()` - `StreamObject.getData()` → calls `get_data()` - `DecodedStreamObject.getData()` → calls `get_data()` - Applied at both `PyPDF2.generic._base` and `PyPDF2.generic` levels ### 3. `odoo/addons/base/tests/test_pdf.py` - Added explicit page copying after `cloneReaderDocumentRoot()` calls in all test methods - This fixes the critical PyPDF2 3.x issue where only document structure is copied, not content pages ## Implementation Details ### Critical PyPDF2 3.x Fix - Page Content Copying In PyPDF2 3.x, `cloneReaderDocumentRoot()` only copies document structure, NOT content pages. This was causing 327-byte PDFs with no actual content. Modules using this method now include explicit page copying: ```python writer.cloneReaderDocumentRoot(reader) # Copy all pages from the reader to the writer (required for PyPDF2 3.x) for page_num in range(reader.getNumPages()): page = reader.getPage(page_num) writer.addPage(page) ``` ### Compatibility Import Pattern ```python try: from PyPDF2 import PdfReader, PdfWriter # Create compatibility classes class PdfFileWriter(PdfWriter): def addPage(self, page): return self.add_page(page) def addMetadata(self, metadata): return self.add_metadata(metadata) def _addObject(self, obj): return self._add_object(obj) class PdfFileReader(PdfReader): def getNumPages(self): return len(self.pages) def getPage(self, page_num): return self.pages[page_num] except ImportError: # Fallback to old API for older PyPDF2 versions from PyPDF2 import PdfFileWriter, PdfFileReader # DecodedStreamObject compatibility wrapper from PyPDF2.generic import DecodedStreamObject as _DecodedStreamObject class DecodedStreamObject(_DecodedStreamObject): """Compatibility wrapper for PyPDF2 3.x DecodedStreamObject""" def setData(self, data): """Compatibility method for set_data()""" if hasattr(self, 'set_data'): return self.set_data(data) else: return super().setData(data) def getData(self): """Compatibility method for get_data()""" if hasattr(self, 'get_data'): return self.get_data() else: return super().getData() # Monkey-patch PyPDF2 generic objects for compatibility # CRITICAL: In PyPDF2 3.x, old methods exist but raise DeprecationError # We MUST override them, not just add them try: import PyPDF2.generic._base as pdf_base # Override getObject to call get_object without deprecation warning if hasattr(pdf_base.IndirectObject, 'get_object'): def _getObject_compat(self): return self.get_object() # Force override even if getObject exists (it raises DeprecationError in 3.x) pdf_base.IndirectObject.getObject = _getObject_compat # Also patch in the generic module from PyPDF2.generic import IndirectObject if hasattr(IndirectObject, 'get_object'): IndirectObject.getObject = _getObject_compat except (ImportError, AttributeError): pass try: from PyPDF2.generic import StreamObject # Override getData to call get_data without deprecation warning if hasattr(StreamObject, 'get_data'): def _getData_compat(self): return self.get_data() # Force override even if getData exists (it raises DeprecationError in 3.x) StreamObject.getData = _getData_compat except (ImportError, AttributeError): pass ``` ### Key Points for Successful Patching 1. **Patch at Base Module Level**: Import `PyPDF2.generic._base` and patch classes there 2. **Force Override**: Don't check if method exists - always override in PyPDF2 3.x 3. **Double Patch**: Patch both `_base` module and `generic` module 4. **Early Application**: Apply patches at module import time, before any PDF objects are created 5. **Error Handling**: Use `(ImportError, AttributeError)` to handle both missing modules and attributes ### Method Compatibility Mapping | Old Method (PyPDF2 < 3.0) | New Method (PyPDF2 ≥ 3.0) | Compatibility Method | |---------------------------|---------------------------|---------------------| | `PdfFileWriter.addPage()` | `PdfWriter.add_page()` | ✅ Wrapped | | `PdfFileWriter.addMetadata()` | `PdfWriter.add_metadata()` | ✅ Wrapped | | `PdfFileWriter._addObject()` | `PdfWriter._add_object()` | ✅ Wrapped | | `PdfFileReader.getNumPages()` | `len(PdfReader.pages)` | ✅ Wrapped | | `PdfFileReader.getPage()` | `PdfReader.pages[]` | ✅ Wrapped | | `PdfFileWriter.appendPagesFromReader()` | `PdfWriter.append_pages_from_reader()` | ✅ Wrapped | | `PdfFileWriter.cloneReaderDocumentRoot()` | `PdfWriter.clone_reader_document_root()` | ✅ Wrapped | | `DecodedStreamObject.setData()` | `DecodedStreamObject.set_data()` | ✅ Wrapped | | `DecodedStreamObject.getData()` | `DecodedStreamObject.get_data()` | ✅ Wrapped | | `StreamObject.getData()` | `StreamObject.get_data()` | ✅ Monkey-patched | | `IndirectObject.getObject()` | `IndirectObject.get_object()` | ✅ Monkey-patched | ## Testing The patch has been successfully tested with: - **PyPDF2 3.0.1** (new API with deprecation errors) - PyPDF2 2.x (old API via fallback) - `OdooPdfFileWriter` instantiation - PDF generation workflows - Report generation (original error case) - PDF attachment operations (account_edi_ubl_cii module) - All deprecated method calls now work without errors ### Test Results ✅ All PyPDF2 deprecation errors resolved: - `PdfFileWriter` → Working - `PdfFileReader` → Working - `setData()` → Working - `getData()` → Working - `getObject()` → Working - PDF report generation → Working - PDF attachments → Working ## Branch Information - **Branch**: `pdfwrite` - **Based on**: Current main/master branch - **Type**: Compatibility patch - **Impact**: Backward compatible - no breaking changes ## Author - **Developer**: Ernad Husremović (hernad@bring.out.ba) - **Company**: bring.out.doo Sarajevo - **Date**: 2025-09-02 ## Related Issues This patch resolves the PyPDF2 deprecation error encountered in: - Report generation (`/report/pdf/` endpoints) - PDF merge operations - PDF attachment handling - Account EDI PDF operations ## Troubleshooting ### If you still get `DeprecationError` after applying the patch: 1. **Check Module Load Order**: Ensure `odoo/tools/pdf.py` is loaded before any PDF operations 2. **Verify Monkey-Patch Application**: The patches must be applied at module import time 3. **Check PyPDF2 Version**: Run `python3 -c "import PyPDF2; print(PyPDF2.__version__)"` 4. **Restart Server Completely**: Use a full server restart, not just a module reload 5. **Check for Multiple PyPDF2 Installations**: Ensure only one PyPDF2 version is installed ### Common Issues: **Issue**: `getObject is deprecated and was removed` - **Cause**: Monkey-patch not applied or overridden by later imports - **Solution**: Ensure patches are at module level, not inside functions **Issue**: `setData is deprecated and was removed` - **Cause**: Using original `DecodedStreamObject` instead of wrapper - **Solution**: Ensure wrapper class is used for all `DecodedStreamObject` instances **Issue**: Empty PDFs (327 bytes) - **Cause**: `cloneReaderDocumentRoot()` doesn't copy pages in PyPDF2 3.x - **Solution**: Always add explicit page copying after `cloneReaderDocumentRoot()` calls ## Future Considerations While this patch provides immediate compatibility, consider: 1. Eventually migrating to the new PyPDF2 API directly 2. Monitoring PyPDF2 changelog for future deprecations 3. Testing with future PyPDF2 versions 4. Consider migrating to `pypdf` (the successor to PyPDF2) when stable ## Installation This patch is automatically applied when using the `pdfwrite` branch. No additional installation steps required.