Lasse Collin | 5d018dc | 2007-12-09 00:42:33 +0200 | [diff] [blame] | 1 | |
XZ Utils To-Do List
===================

Known bugs
----------

The test suite is too incomplete.

If the memory usage limit is less than about 13 MiB, xz cannot
automatically scale down the compression settings far enough, even
though switching from a BT2/BT3/BT4 match finder to HC3/HC4 would
make it possible.
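
As a rough illustration of the fallback described above, the sketch
below builds two LZMA2 filter chains through Python's standard lzma
module, which wraps liblzma: one with the BT4 match finder and one
with HC4 and a smaller dictionary. The dictionary sizes and the sample
data are assumptions made for the demonstration. The module does not
expose encoder memory usage, so the sketch only shows that both chains
produce valid .xz data; it is not how xz itself scales settings.

```python
import lzma

# Sample input; any bytes would do.
data = b"xz to-do list example " * 4096

def xz_with_match_finder(payload, mf, dict_size):
    # Start from preset 6 and override only the match finder and the
    # dictionary size (hypothetical helper, not part of any API).
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "preset": 6,
        "mf": mf,
        "dict_size": dict_size,
    }]
    return lzma.compress(payload, format=lzma.FORMAT_XZ, filters=filters)

bt4 = xz_with_match_finder(data, lzma.MF_BT4, 8 << 20)
hc4 = xz_with_match_finder(data, lzma.MF_HC4, 1 << 20)

# Both results are ordinary .xz files; the match finder is an
# encoder-side choice and does not change the decoder.
assert lzma.decompress(bt4) == data
assert lzma.decompress(hc4) == data
print(len(bt4), len(hc4))
```

The compressed sizes will usually differ, because the hash chain
match finders trade some compression ratio for lower memory use.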

XZ Utils compresses some files significantly worse than LZMA Utils.
This is due to the faster compression presets used by XZ Utils, and
can often be worked around by using "xz --extreme". With some files
even --extreme isn't enough, though. This is most likely with files
that compress extremely well, where going from a compression ratio
of 0.003 to 0.004 means a big relative increase in the compressed
file size.
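
The arithmetic behind that last remark can be sketched with Python's
standard lzma module, where PRESET_EXTREME corresponds to the
--extreme option. The sample input and the helper names are
assumptions for the demonstration, and no claim is made about which
preset wins on this particular input.

```python
import lzma

# Highly repetitive sample input (an assumption for the demo).
data = b"A" * 1_000_000

normal = lzma.compress(data, preset=6)
extreme = lzma.compress(data, preset=6 | lzma.PRESET_EXTREME)

def ratio(compressed, original):
    # Compression ratio as used in the text: compressed / original.
    return len(compressed) / len(original)

def relative_increase(old_ratio, new_ratio):
    # Relative growth of the compressed size when the ratio worsens.
    return (new_ratio - old_ratio) / old_ratio

print(f"preset 6:  {ratio(normal, data):.6f}")
print(f"preset 6e: {ratio(extreme, data):.6f}")
print(f"0.003 -> 0.004: {relative_increase(0.003, 0.004):.0%} larger")
```

A change of 0.001 in the ratio is invisible for typical files, but at
a ratio of 0.003 it makes the output about one third larger.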

xz doesn't quote unprintable characters when it displays file names
given on the command line.
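
One possible quoting scheme is sketched below in Python. The function
name and the choice of backslash escapes are assumptions for
illustration only; xz is written in C and may end up using a
different scheme.

```python
def quote_unprintable(name: str) -> str:
    """Return name with backslashes and unprintable characters escaped."""
    out = []
    for ch in name:
        if ch == "\\":
            # Escape the escape character so the result is unambiguous.
            out.append("\\\\")
        elif ch.isprintable():
            out.append(ch)
        else:
            # unicode_escape produces \n, \t, \xNN, \uNNNN and so on.
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)

# A file name containing a newline and a terminal escape sequence.
print(quote_unprintable("report\n\x1b[31m.xz"))
```

Quoting like this keeps a hostile file name from moving the cursor or
changing terminal colors when a message is printed.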

tuklib_exit() doesn't block signals, so EINTR is possible.

SIGTSTP is not handled. If xz is stopped, the estimated remaining
time and the calculated (de)compression speed won't make sense in
the progress indicator (xz --verbose).

If liblzma has created threads and fork() gets called, liblzma
code will break in the child process unless the child calls exec()
and doesn't touch liblzma.


Missing features
----------------

Add support for storing metadata in .xz files. A preliminary
idea is to create a new Stream type for metadata. When both
metadata and data are wanted in the same .xz file, two or more
Streams would be concatenated.
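
Concatenating Streams already works with today's format, which is
what makes this idea plausible. The Python sketch below shows only
that existing behavior. No metadata Stream type exists yet, so both
Streams here are ordinary data Streams and the "metadata" payload is
a made-up placeholder.

```python
import lzma

metadata = b'{"note": "hypothetical metadata payload"}'
payload = b"actual file contents\n" * 100

# Two independent .xz Streams written back to back.
combined = lzma.compress(metadata) + lzma.compress(payload)

# lzma.decompress() decodes every concatenated Stream and joins the output.
assert lzma.decompress(combined) == metadata + payload

# A single LZMADecompressor stops at the end of the first Stream and
# exposes the remaining bytes as unused_data, so a reader can handle
# each Stream separately.
first = lzma.LZMADecompressor()
assert first.decompress(combined) == metadata
assert first.eof

second = lzma.LZMADecompressor()
assert second.decompress(first.unused_data) == payload
```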

The state stored in lzma_stream should be cloneable. This would
be mostly useful when using a preset dictionary in LZMA2, but
it may have other uses too. Compare to deflateCopy() in zlib.
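
For comparison, this is roughly what the zlib feature looks like from
Python, whose Compress.copy() method is built on deflateCopy(). The
dictionary and the request strings are invented for the example; the
sketch shows zlib only and says nothing about how a liblzma
equivalent would be designed.

```python
import zlib

# Preset dictionary shared by encoder and decoder (sample data).
dictionary = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n"
prefix = b"GET /index.html HTTP/1.1\r\n"

base = zlib.compressobj(level=6, zdict=dictionary)
head = base.compress(prefix)

# The clone carries the dictionary and everything consumed so far,
# so the shared prefix is compressed only once.
clone = base.copy()

variant_a = head + base.compress(b"Host: a.example.org\r\n") + base.flush()
variant_b = head + clone.compress(b"Host: b.example.org\r\n") + clone.flush()

def inflate(data: bytes) -> bytes:
    # The decoder needs the same preset dictionary.
    d = zlib.decompressobj(zdict=dictionary)
    return d.decompress(data) + d.flush()

assert inflate(variant_a) == prefix + b"Host: a.example.org\r\n"
assert inflate(variant_b) == prefix + b"Host: b.example.org\r\n"
```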

Support LZMA_FINISH in the raw decoder to indicate the end of LZMA1
and other streams that don't have an end of payload marker.

Adjust dictionary size when the input file size is known.
Maybe do this only if an option is given.
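
A minimal sketch of the dictionary size idea, using Python's standard
lzma module. The helper names and the power-of-two rounding policy are
assumptions; 4096 bytes is the smallest dictionary LZMA2 accepts, and
8 MiB is the dictionary size of the default preset.

```python
import lzma

DICT_SIZE_MIN = 4096        # smallest dictionary LZMA2 accepts
PRESET_DICT_SIZE = 8 << 20  # dictionary size of the default preset (6)

def fitted_dict_size(input_size, preset_dict_size=PRESET_DICT_SIZE):
    # Smallest power of two that covers the input, clamped to the
    # range [DICT_SIZE_MIN, preset_dict_size]. Hypothetical policy.
    size = DICT_SIZE_MIN
    while size < input_size and size < preset_dict_size:
        size *= 2
    return min(size, preset_dict_size)

def compress_known_size(payload):
    # The whole input is in memory here, so its size is known up front.
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "preset": 6,
        "dict_size": fitted_dict_size(len(payload)),
    }]
    return lzma.compress(payload, format=lzma.FORMAT_XZ, filters=filters)

small = b"tiny input " * 100
assert lzma.decompress(compress_known_size(small)) == small
print(fitted_dict_size(len(small)))
```

A dictionary larger than the input cannot find any extra matches, so
shrinking it lowers encoder and decoder memory use without hurting
the compression ratio.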

xz doesn't support copying extended attributes, access control
lists etc. from source to target file.

Multithreaded compression:
- Reduce memory usage of the current method.
- Implement threaded match finders.
- Implement pigz-style threading in LZMA2.
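
To make the trade-off behind these items concrete, here is a small
Python sketch of the simplest form of parallel compression: split the
input into independent chunks and compress each in its own thread.
This is not xz's implementation; the chunk size, worker count, and
function name are assumptions, and each chunk becomes a separate .xz
Stream.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary choice for the sketch

def compress_parallel(payload, workers=4):
    # Split the input into independent chunks; keep one empty chunk
    # so that empty input still yields a valid Stream.
    chunks = [payload[i:i + CHUNK_SIZE]
              for i in range(0, len(payload), CHUNK_SIZE)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so Streams are joined in order.
        streams = list(pool.map(lambda c: lzma.compress(c, preset=1), chunks))
    return b"".join(streams)

data = bytes(range(256)) * 12_000  # about 3 MiB of sample input
packed = compress_parallel(data)

# Concatenated Streams decode back to the original input.
assert lzma.decompress(packed) == data
```

Every thread needs its own encoder state and no chunk can refer to
data in another chunk, which is why memory usage and compression
ratio are the usual costs of this approach.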

Multithreaded decompression.

Buffer-to-buffer coding could use less RAM (especially when
decompressing LZMA1 or LZMA2).

An I/O library (similar to gzopen() in zlib) is not implemented.
It will be a separate library that supports uncompressed, .gz,
.bz2, .lzma, and .xz files.
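
The kind of interface meant here can be approximated in Python with
the standard gzip, bz2, and lzma modules. The function name open_any
and the detection rules are assumptions for this sketch. In
particular, .lzma files have no magic bytes, so the sketch falls back
to the file name suffix for them.

```python
import bz2
import gzip
import lzma

# Magic bytes at the start of each container format.
MAGIC = (
    (b"\x1f\x8b", gzip.open),
    (b"BZh", bz2.open),
    (b"\xfd7zXZ\x00", lzma.open),
)

def open_any(path):
    """Open path for binary reading, decompressing transparently."""
    with open(path, "rb") as f:
        head = f.read(6)
    for magic, opener in MAGIC:
        if head.startswith(magic):
            return opener(path, "rb")
    if str(path).endswith(".lzma"):
        # No magic bytes exist for .lzma; trust the suffix.
        return lzma.open(path, "rb", format=lzma.FORMAT_ALONE)
    # Anything else is treated as uncompressed data.
    return open(path, "rb")
```

A caller would use it like the built-in open(), for example
"with open_any(name) as f: data = f.read()", without caring which
format the file is in.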

Support changing lzma_options_lzma.mode with lzma_filters_update().

Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
Block and Stream boundaries.

lzma_strerror() to convert lzma_ret to human-readable form?
This is tricky, because the same error codes are used with
slightly different meanings, and this cannot be fixed anymore.

Make it possible to adjust LZMA2 options in the middle of a Block
so that the encoding speed vs. compression ratio can be optimized
when the compressed data is streamed over a network.

Improved BCJ filters. The current filters are small but they aren't
so great when compressing binary packages that contain various file
types. Specifically, they make things worse if there are static
libraries or Linux kernel modules. The filtering could also be
more effective (without getting overly complex); for example, a
streamable variant of BCJ2 from 7-Zip could be implemented.
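
For context, this is how one of the current BCJ filters is applied in
a filter chain, sketched with Python's standard lzma module. The
helper name and the sample bytes are assumptions. The sample is not
real machine code, so the sketch demonstrates only that the chain
round-trips, not that it improves compression.

```python
import lzma

def compress_with_x86_bcj(payload):
    # Hypothetical helper: x86 BCJ filter followed by LZMA2.
    filters = [
        {"id": lzma.FILTER_X86},                 # BCJ filter runs first
        {"id": lzma.FILTER_LZMA2, "preset": 6},  # LZMA2 must be last
    ]
    return lzma.compress(payload, format=lzma.FORMAT_XZ, filters=filters)

sample = bytes(range(256)) * 256  # arbitrary bytes, not real machine code
packed = compress_with_x86_bcj(sample)

# The .xz headers record the filter chain, so decoding needs no options.
assert lzma.decompress(packed) == sample
```

The filter is applied to the whole input regardless of what it
contains, which is the limitation the item above describes for
packages that mix several file types.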

Filter that autodetects specific data types in the input stream
and applies appropriate filters to the correct parts of the input.
Perhaps combine this with the BCJ filter improvement point above.

Long-range LZ77 method as a separate filter or as a new LZMA2
match finder.


Documentation
-------------

More tutorial programs are needed for liblzma.

Document the LZMA1 and LZMA2 algorithms.


Miscellaneous
-------------

Try to get the media type for .xz registered at IANA.