[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article
Free access

Delta algorithms: an empirical analysis

Published: 01 April 1998 Publication History

Abstract

Delta algorithms compress data by encoding one file in terms of another. This type of compression is useful in a number of situations: strong multiple versions of data, displaying differences, merging changes, distributing updates, storing backups, transmitting video sequences, and others. This article studies the performance parameters of several delta algorithms, using a benchmark of over 1,300 pairs of files taken from two successive releases of GNU software. Results indicate that modern delta compression algorithms based on Ziv-Lempel techniques significantly outperform diff, a popular but older delta compressor, in terms of compression ratio. The modern compressors also correlate better with the actual difference between files without sacrificing performance.

References

[1]
GOSLING, J.A. 1981. A redisplay algorithm. In Proceedings of the ACM SIGPLAN/SIGOA Symposium on Text Manipulation. ACM, New York, NY, 123-129.
[2]
HUNT, J. W. AND MCILLROY, M.D. 1976. An algorithm for differential file comparison. Tech. Rep. 41, AT&T Bell Laboratories, Inc., Murray Hill, NJ.
[3]
HUNT, J. W. AND SZYMANSKI, T. G. 1977. A fast algorithm for computing longest common subsequences. Commun. ACM 20, 5 (May), 350-353.
[4]
JONES, D.W. 1988. Application of splay trees to data compression. Commun. ACM 31, 8 (Aug.), 996-1007.
[5]
MCCREIGHT, E.M. 1976. A space-economical suffix tree construction algorithm. J. ACM 23, 2 (Apr.), 262-272.
[6]
MILLER, W. AND MEYERS, E.W. 1985. A file comparison program. Softw. Pract. Exper. 15, 11 (Nov.), 1025-1039.
[7]
NAKATSU, N., KAMBAYASHI, Y., AND YAJIMA, S. 1982. A longest common subsequence algorithm for similar text strings. Acta Inf. 18, 171-179.
[8]
OBST, W. 1987. Delta technique and string-to-string correction. In Proceedings of the 1st European Software Engineering Conference (ESEC '87) (Strasbourg, France, Sept. 9-11, 1987), H. Nichols and D. Simpson, Eds. Springer-Verlag, Berlin, Germany, 69-73.
[9]
ROCHKIND, M.J. 1975. The source code control system. IEEE Trans. Softw. Eng. SE-1, 4 (Dec.), 364-370.
[10]
TICHY, W.F. 1984. The string-to-string correction problem with block moves. ACM Trans. Comput. Syst. 2, 4 (Nov.), 309-321.
[11]
TICHY, W.F. 1985. RCS: A system for version control. Softw. Pract. Exper. 15, 7 (July), 637-654.
[12]
ZIV, J. AND LEMPEL, A. 1978. Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theor. IT-24, 5 (Sept.).

Cited By

View all
  • (2024)To Store or Not to Store: a graph theoretical approach for Dataset Versioning2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS57955.2024.00049(479-493)Online publication date: 27-May-2024
  • (2022)Donag: Generating Efficient Patches and Diffs for Compressed ArchivesACM Transactions on Storage10.1145/350791918:3(1-41)Online publication date: 21-Sep-2022
  • (2022)RIBiT: Reduced Intra-flit Bit Transitions for Bufferless NoC2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC)10.1109/VLSI-SoC54400.2022.9939590(1-6)Online publication date: 3-Oct-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Software Engineering and Methodology
ACM Transactions on Software Engineering and Methodology  Volume 7, Issue 2
April 1998
106 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/279310
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 1998
Published in TOSEM Volume 7, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. benchmark
  2. delta encoding
  3. differencing

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)317
  • Downloads (Last 6 weeks)47
Reflects downloads up to 22 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)To Store or Not to Store: a graph theoretical approach for Dataset Versioning2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS57955.2024.00049(479-493)Online publication date: 27-May-2024
  • (2022)Donag: Generating Efficient Patches and Diffs for Compressed ArchivesACM Transactions on Storage10.1145/350791918:3(1-41)Online publication date: 21-Sep-2022
  • (2022)RIBiT: Reduced Intra-flit Bit Transitions for Bufferless NoC2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC)10.1109/VLSI-SoC54400.2022.9939590(1-6)Online publication date: 3-Oct-2022
  • (2022)UFC2: User-Friendly Collaborative CloudIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2021.313249633:9(2163-2182)Online publication date: 1-Sep-2022
  • (2022)Delta Encoding Correction for Mobile Edge Caching in Internet of VehiclesIEEE Systems Journal10.1109/JSYST.2021.311576716:3(3805-3816)Online publication date: Sep-2022
  • (2021)Extended Base-Delta Compression Technique for On-chip Data Transfer2021 IEEE 18th India Council International Conference (INDICON)10.1109/INDICON52576.2021.9691570(1-7)Online publication date: 19-Dec-2021
  • (2021)Comparative studies on the high-performance compression of SARS-CoV-2 genome collectionsBriefings in Functional Genomics10.1093/bfgp/elab041Online publication date: 9-Dec-2021
  • (2021)Classification criteria for data deduplication methodsData Deduplication Approaches10.1016/B978-0-12-823395-5.00011-2(69-96)Online publication date: 2021
  • (2020)Lock-free collaboration support for cloud storage services with operation inference and transformationProceedings of the 18th USENIX Conference on File and Storage Technologies10.5555/3386691.3386694(13-28)Online publication date: 24-Feb-2020
  • (2020)Delta compression optimisation for UAV-enabled mobile edge cachingIET Communications10.1049/iet-com.2019.1214Online publication date: 17-Mar-2020
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media