[bmap-tools] [PATCH 5/9] TransRead: add pbzip2 support

Artem Bityutskiy dedekind1 at gmail.com
Tue Jan 28 08:45:57 EST 2014


From: Artem Bityutskiy <artem.bityutskiy at intel.com>

This patch adds support for multi-stream bz2 files (created with pbzip2).
Unfortunately, the standard Python 2.7 'bz2' module does not support them, so we
use the 'bz2file' module from PyPI.

'bz2file' may not be present in the system, in which case we fall back to the
standard Python 2.7 'bz2' module. As a bonus, 'bz2file' is a little bit faster
than 'bz2' even for single-stream archives.

Change-Id: I7b473987a3bda0241f49da5bbdf1a8a2b3841400
Signed-off-by: Artem Bityutskiy <artem.bityutskiy at intel.com>
---
 bmaptools/TransRead.py | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
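
For illustration only (not part of the patch): a minimal sketch of why a
multi-stream aware reader is needed. It assumes either that 'bz2file' is
installed or that the interpreter is Python 3.3+, where the standard 'bz2'
module handles multiple streams itself. The sample data and the names
'bz2_multi', 'multi_stream', 'first_only' and 'everything' are made up for
this example.

```python
import bz2
import io

try:
    # Optional PyPI backport of the Python 3.3 'bz2' module.
    import bz2file as bz2_multi
except ImportError:
    # Assumption: Python 3.3+, where the standard module reads all streams.
    bz2_multi = bz2

# pbzip2 compresses chunks independently and concatenates the resulting
# streams; joining two bz2.compress() results mimics that layout.
multi_stream = bz2.compress(b"first chunk, ") + bz2.compress(b"second chunk")

# A single BZ2Decompressor stops at the end of the first stream and leaves
# the remaining bytes in its 'unused_data' attribute.
single = bz2.BZ2Decompressor()
first_only = single.decompress(multi_stream)

# A multi-stream aware BZ2File keeps going through every stream.
with bz2_multi.BZ2File(io.BytesIO(multi_stream), "r") as f_obj:
    everything = f_obj.read()
```

Here 'first_only' holds only the data of the first stream, while 'everything'
holds the data of both, which is the behaviour a pbzip2 archive needs.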

diff --git a/bmaptools/TransRead.py b/bmaptools/TransRead.py
index a2730e2..950ab7c 100644
--- a/bmaptools/TransRead.py
+++ b/bmaptools/TransRead.py
@@ -316,9 +316,20 @@ class TransRead(object):
             elif self.name.endswith('.bz2'):
                 import bz2
 
+                # Let's try to use the bz2file module, which is a backport from
+                # Python 3.3 available in PyPI. It supports multiple streams
+                # (pbzip2) and handles out-of-memory issues nicely.
+                try:
+                    import bz2file
+
+                    f_obj = bz2file.BZ2File(self._f_objs[-1], 'r')
+                except ImportError:
+                    import bz2
+
+                    f_obj = _CompressedFile(self._f_objs[-1],
+                                      bz2.BZ2Decompressor().decompress, 128)
+
                 self.compression_type = 'bzip2'
-                f_obj = _CompressedFile(self._f_objs[-1],
-                                        bz2.BZ2Decompressor().decompress, 128)
                 self._f_objs.append(f_obj)
 
                 if self.name.endswith('.tar.bz2'):
-- 
1.8.3.1



