WO2001063772A1 - Method and apparatus for optimized lossless compression using a plurality of coders - Google Patents

Method and apparatus for optimized lossless compression using a plurality of coders Download PDF

Info

Publication number
WO2001063772A1
Authority
WO
WIPO (PCT)
Prior art keywords
lossless
data
coders
data stream
compression
Prior art date
Application number
PCT/US2001/005722
Other languages
French (fr)
Inventor
Igor V. Ternovskiy
Aleksandr A. Devivye
Joseph Rotenberg
Freddie Lin
Original Assignee
Physical Optics Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Physical Optics Corporation filed Critical Physical Optics Corporation
Priority to AU2001241672A priority Critical patent/AU2001241672A1/en
Priority to EP01912942A priority patent/EP1266455A4/en
Priority to JP2001562848A priority patent/JP2003524983A/en
Publication of WO2001063772A1 publication Critical patent/WO2001063772A1/en

Links

Classifications

    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • the present invention is directed to data compression techniques and, more particularly, to a method and apparatus for selecting among different types of lossless compression coders to optimize system performance.
  • Data compression operates to minimize the number of bits used to store or transmit information and encompasses a wide array of software and hardware compression techniques. Notably, depending on the type of data to be compressed and any number of other factors, particular compression techniques can provide markedly superior performance in terms of compression ratio and coding speed.
  • data compression includes taking a stream of symbols or phrases and converting them into codes that are smaller (in bit length) than the original data.
  • Known compression techniques and algorithms can be divided into two major families including lossy and lossless.
  • Lossy data compression can be used to greatly increase data compression ratios; however, increased compression comes at the expense of a certain loss in accuracy. As a result, lossy compression typically is implemented only in those instances in which some data loss is acceptable. For example, lossy compression is used effectively when applied to digitized voice signals and graphics images.
  • Lossless compression is a family of data compression that utilizes techniques designed to generate an exact duplicate of the input data stream after a compression/decompression cycle. This type of compression is necessary when storing database records, word processing files, etc., where loss of information is absolutely unacceptable.
  • the present invention is directed to lossless data compression.
  • Some lossless compression algorithms use information theory to generate variable length codes when given a probability table for a given set of symbols.
  • the decision to output a certain code for a particular symbol or set of symbols is based on a model.
  • the model is a set of rules used to process input messages, and in response, determine which codes to output.
  • An algorithm or program uses the model to analyze the symbols (e.g., determine a probability associated with the symbol) and then outputs an appropriate code based on that processing.
  • a model should be selected that predicts symbols or phrases with high probabilities because symbols or messages that have a high probability have a low information content, and therefore require fewer bits to encode.
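This probability-to-bits relationship follows directly from information theory: a symbol of probability p carries -log2(p) bits of information. A minimal illustration (the function name is ours, not the patent's):

```python
import math

def information_content(p):
    """Bits of information carried by a symbol that occurs with probability p."""
    return -math.log2(p)

# High-probability symbols carry little information and need few bits;
# rare symbols carry more and need longer codes.
for p in (0.9, 0.5, 0.25, 0.01):
    print(f"p={p}: {information_content(p):.3f} bits")
```

This is why a model that assigns high probabilities to the symbols it actually sees compresses well: the frequent symbols cost almost nothing to encode.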
  • the next step is to encode the symbols using a particular lossless coder.
  • lossless compression coders can be grouped according to whether they implement statistical modeling or dictionary-based modeling. Statistical modeling reads and encodes a single symbol at a time using the probability of the character's appearance, while dictionary-based modeling uses a single code to replace strings of symbols.
  • in dictionary-based modeling, the model is significantly more important than in statistical-based modeling because problems associated with encoding every symbol are significantly reduced.
  • One form of statistical data compression is known as Shannon-Fano (S-F) coding.
  • S-F coding was developed to provide variable-length bit coding so as to allow coding symbols with exactly (or a close approximation to) the number of bits of information that the message or symbol contains.
  • S-F coding relies on knowing the probability of each symbol's appearance in a message. After the probabilities are determined, a table of codes is constructed with each code having a different number of bits (advantageously, symbols with low probabilities have more bits).
  • One problem with a coding technique such as this is that it creates variable length codes that have an integral number of bits, even though the information to be coded likely will require a non-integral number of bits.
  • Huffman coding is similar to S-F coding in that it creates variable length codes that are an integral number of bits, but it utilizes a completely different algorithm.
  • S-F and Huffman coding are close in performance, but Huffman coding has been shown always to at least equal the efficiency of S-F coding; it is therefore preferred, especially since both algorithms require a similar amount of processing power.
  • Although Huffman coding is relatively easy to implement and economical for both coding and decoding, it is inefficient due to its use of an integral number of bits per code, as in S-F coding. If a particular symbol is determined to have an information content (i.e., entropy) of 1.5 bits, a Huffman coder will generate a code having a bit count of either one or two bits.
  • If a statistical method assigns a 90% probability to a given symbol, the optimal code size would be about 0.15 bits; however, Huffman or S-F coding likely would assign a one-bit code to the symbol, roughly six times larger than necessary.
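The integral-bit limitation can be seen by building a Huffman tree for the 90%-probability example above. The sketch below is our own minimal implementation (tracking only code lengths, not the codes themselves); it shows the dominant symbol still costing a full bit:

```python
import heapq

def huffman_code_lengths(freqs):
    """Build a Huffman tree bottom-up, returning {symbol: code length in bits}."""
    heap = [(weight, i, {sym: 0}) for i, (sym, weight) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)  # unique tiebreaker so the dicts are never compared
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Every symbol in the two merged subtrees moves one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# A symbol with 90% probability carries only ~0.15 bits of information,
# yet Huffman coding must still spend a whole bit on it.
print(huffman_code_lengths({"A": 0.9, "B": 0.05, "C": 0.05}))
```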
  • Arithmetic coding replaces a stream of input symbols with a single floating point output number, and bypasses the step of replacing an input symbol with a specific code.
  • an arithmetic code is not limited to being optimal only when the symbol probabilities are integral powers of one-half (which is most often not the case), it attains the theoretical entropy of the symbol to be coded, thus maximizing compression efficiency for any known source.
  • the entropy of a given character is 1.5 bits
  • arithmetic coding uses 1.5 bits to encode the symbol, an impossibility for Huffman and Shannon-Fano coding.
  • arithmetic coding is extremely efficient, it consumes rather large amounts of computing resources, both in terms of CPU power and memory. This is due to the fact that sophisticated models that demand a significant amount of memory must be built, and that the algorithm itself requires a significant amount of computational operations.
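The interval-narrowing idea behind arithmetic coding can be sketched with floating-point arithmetic. This is adequate only for short messages; production coders use integer arithmetic with renormalization to avoid precision loss, and the function names here are ours:

```python
def _ranges(probs):
    """Assign each symbol a sub-interval of [0, 1) proportional to its probability."""
    ranges, cum = {}, 0.0
    for s, p in probs.items():
        ranges[s] = (cum, cum + p)
        cum += p
    return ranges

def arithmetic_encode(message, probs):
    """Narrow [0, 1) once per symbol; any number in the final interval encodes the message."""
    ranges = _ranges(probs)
    low, high = 0.0, 1.0
    for s in message:
        span = high - low
        lo_s, hi_s = ranges[s]
        low, high = low + span * lo_s, low + span * hi_s
    return (low + high) / 2  # a single number for the whole message

def arithmetic_decode(x, n, probs):
    """Recover n symbols by repeatedly locating x and rescaling it."""
    ranges = _ranges(probs)
    out = []
    for _ in range(n):
        for s, (lo_s, hi_s) in ranges.items():
            if lo_s <= x < hi_s:
                out.append(s)
                x = (x - lo_s) / (hi_s - lo_s)
                break
    return "".join(out)

probs = {"A": 0.5, "B": 0.25, "C": 0.25}
code = arithmetic_encode("ABAC", probs)
print(code, arithmetic_decode(code, 4, probs))
```

Because the whole message maps to one number, a 90%-probability symbol narrows the interval only slightly and so contributes only a fraction of a bit, which is exactly what integral-bit codes cannot do.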
  • dictionary-based compression algorithms replace occurrences of particular phrases (i.e., groups of bytes) in a data stream with a reference to a previous occurrence of those phrases.
  • dictionary-based algorithms do not encode single symbols. Rather, dictionary-based compression techniques encode variable length strings of symbols as single "tokens." It is these tokens that form an index to a phrase dictionary. Because the tokens are smaller than the phrases they replace, compression occurs.
  • Two main classes of dictionary-based compression schemes are known as the LZ77 and LZ78 compression algorithms of the Lempel-Ziv family of compression coders.
  • dictionary-based coding is utilized extensively in desktop general purpose compression and has been implemented by CompuServe Information Service to encode bit-mapped graphical images.
  • the GIF format uses an LZW variant to compress repeated sequences and screen images.
  • dictionary-based compression techniques are very popular forms of compression, the disadvantage of such algorithms is that a more sophisticated data structure is needed to handle the dictionary.
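The token-substitution approach can be illustrated with a minimal LZ78-style coder (our own sketch, not taken from the patent): each token is a (dictionary index, next character) pair, and the dictionary of phrases is rebuilt identically on both sides.

```python
def lz78_encode(text):
    """Emit (dictionary index, next char) tokens; index 0 means 'no prefix'."""
    dictionary, tokens, phrase = {}, [], ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch          # keep extending the longest known phrase
        else:
            tokens.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:                    # flush a trailing phrase with no new char
        tokens.append((dictionary[phrase], ""))
    return tokens

def lz78_decode(tokens):
    """Rebuild the phrase dictionary while expanding tokens back into text."""
    phrases, out = [""], []
    for idx, ch in tokens:
        phrase = phrases[idx] + ch
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)
```

Repetitive input quickly builds long dictionary phrases, so later tokens stand in for many characters each, which is where the compression comes from.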
  • the present invention is directed to a method and apparatus that determines which of a number of embedded coding schemes will optimally compress different portions of an incoming data stream.
  • the method of the preferred embodiment is designed to accommodate a data stream characterized by having different packets of information (e.g., from sources unknown to the encoders) each of which may have different associated statistics.
  • a method of lossless compression of a stream of data includes providing a plurality of lossless coders. The method then includes selecting one of the lossless coders to compress the stream of data, and thereafter encoding the data stream with the selected lossless coder.
  • a method of lossless compression of a stream of data includes using a plurality of lossless coders to compress a test portion of the data stream. Once the test portion is compressed, the method determines a performance characteristic associated with each of the lossless coders. Then the method includes selecting one of the lossless coders based on the determining step and encoding a first portion of the data stream with the selected coder. Thereafter, the method includes repeating the using, determining, selecting and encoding steps for another test portion and a second portion of the data stream. Notably, the repeating step may include selecting a different one of the lossless coders.
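A rough sketch of this select-then-encode loop, using Python's stock zlib, bz2, and lzma compressors as stand-ins for the patent's Huffman, arithmetic, and LZ coders (selection here is by output size alone, whereas the patent also weighs encoding time):

```python
import bz2
import lzma
import zlib

# Stand-in lossless coders; the patent's array would hold Huffman,
# arithmetic, and Lempel-Ziv coders at several bits-per-word settings.
CODERS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def pick_coder(test_block):
    """Trial-compress the test portion with every coder; return the best performer."""
    sizes = {name: len(fn(test_block)) for name, fn in CODERS.items()}
    return min(sizes, key=sizes.get)

def encode_stream(chunks, test_bytes=4096):
    """For each chunk, select a coder on its leading test portion, then encode the chunk."""
    for chunk in chunks:
        name = pick_coder(chunk[:test_bytes])
        yield name, CODERS[name](chunk)
```

Because the selection repeats for every chunk, a stream whose statistics change mid-way (the unspecified-source case the patent targets) can switch coders without any prior knowledge of the data.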
  • each of the lossless coders uses (1) a compression technique, and (2) a number of bits per word determined by the selecting step, in the encoding step.
  • the compression technique is one of Arithmetic coding, Huffman coding and LZ coding.
  • an apparatus for lossless data compression includes an interface to receive a stream of data.
  • the apparatus includes a plurality of lossless coders and a processor.
  • each lossless coder separately compresses a test portion of the data stream and, in response, the processor determines a performance characteristic associated with each of the lossless coders, and then selects, based on the performance characteristics, one of the lossless coders to encode at least a first portion of the data stream.
  • the performance characteristic includes at least one of compression ratio and duration of the compression of the test portion for a corresponding lossless coder.
  • the encoder includes a plurality of processors, each of the lossless coders corresponds to one of the processors, and the lossless coders compress the same test portion in parallel.
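The parallel trial compression might be sketched as follows, with worker threads standing in for the per-coder processors (the coder choices and names are ours, not the patent's):

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor

CODERS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def trial_compress_parallel(test_block):
    """Run every trial compression concurrently, one worker per coder,
    and return the compressed size produced by each."""
    with ThreadPoolExecutor(max_workers=len(CODERS)) as pool:
        futures = {name: pool.submit(fn, test_block) for name, fn in CODERS.items()}
        return {name: f.result() and len(f.result()) for name, f in futures.items()}
```

Running the trials concurrently means the selection step costs roughly as much wall-clock time as the slowest single coder, rather than the sum of all of them.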
  • FIG. 1 is a flow diagram showing the general operation of a method of the preferred embodiment
  • FIG. 1A is a chart showing an array of lossless coders used in the method shown in FIG. 1;
  • FIG. 2 is a generic block diagram showing an encoding/decoding system of the preferred embodiment.
  • FIG. 3 is a schematic diagram showing the data stream as it is encoded/decoded by the system shown in FIG. 2.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS: Referring to Fig. 1, a method 10 includes, after initialization and start-up at Step 12, inputting data to the system at Step 14.
  • the data input at Step 14 may be synchronous or asynchronous data.
  • the data stream may be received from unspecified sources such as sensors that monitor temperature, pressure, etc. of a subject (e.g., telemetric data gathered in military applications) and that continuously transmit readings to the system encoder (described below) of the preferred embodiment.
  • Unspecified data necessarily implies that the statistics associated with the data are random, and therefore unlike known systems that perform compression with a single type of encoder based on knowledge regarding the statistics of the data, the preferred embodiment is able to efficiently code a data stream comprised of different types of data. Other types of applications where this type of random data may originate from multiple sources include hospital monitoring applications, chemical factories, nuclear plants, and others.
  • as the data is continuously input to the system, it is transmitted to a division on communication block where method 10, at Step 16, processes the data by dividing or framing it for further communication.
  • the division on communication block is implemented by method 10 in conventional fashion.
  • the data is pre-processed, which may include generating a histogram indicating the statistics associated with the data framed in Step 16.
  • method 10 adds synchronization and header codes at Step 20, as required to further process and identify the data bits in the data stream.
  • the data is transmitted to a plurality of coders that provide lossless compression.
  • method 10 codes a test portion of the data stream with a plurality of lossless coders and determines system performance criteria associated with each of the coders.
  • the coders used to code the portion of the data in Step 22 are shown at 32 in chart 30 of Fig. 1A.
  • the columns of the charts indicate different types of lossless coding techniques/algorithms which may include Huffman coding, Arithmetic coding, Lempel-Ziv coding, as well as variants of these and other known coding techniques.
  • the method also compares the output of the coding techniques with the data stream without encoding/compressing because, in certain circumstances, uncompressed data may be optimum.
  • the columns comprise lossless coding techniques.
  • the rows comprise different designations for the number of bits per word, bpw 1-m, that may be used to encode the data.
  • the bits per word associated with the interface may be set to be eight bits, ten bits, etc., for example.
  • method 10 codes a portion of the data with n x m number of lossless coders.
  • Step 22 is performed for a test period of time or amount of data to determine which of the lossless coders achieves optimum system performance prior to encoding. Thereafter, the data is encoded (described below).
  • the test compression performed by coders 32 in Step 22 preferably is conducted in parallel to quickly compile data corresponding to each of the lossless coders.
  • Parallel coding of the test data is possible due to the fact that computing power has become so inexpensive that the benefits (e.g., in terms of encoding speed) greatly outweigh the costs.
  • each of coders 32 shown in Fig. 1A may code the test data sequentially over a designated period of time to produce the corresponding performance data. Although not preferred, such a sequential test may be performed when computing power is at a premium.
  • the performance criteria generated in Step 22 for nine different lossless coders (three different word lengths x three different coding techniques) are shown.
  • the input bit rate is set at a predetermined value
  • the output bit rate although preferably set based on the transmission medium employed, may be continuously updated based on feedback information regarding the lossless coder employed. Optimally, the output bit rate will be made as small as possible.
  • in Table 1, after compressing a test amount of data, an output file size in bytes, a compression ratio, and a time to encode are each determined for a designated speed input (kbit/second) and speed output (kbit/second).
  • Huffman coding achieves a compression ratio of 1.8272
  • Lempel-Ziv coding achieves a ratio of 2.505
  • arithmetic coding achieves a ratio of 2.7724.
  • the time to encode the test data for each of these algorithms is 128 seconds, 522 seconds, and 1,582 seconds, respectively.
  • the selection made in Step 24 typically is not based solely on compression ratio realized but rather the selection is made based on a combination of overall processing time and compression ratio performance characteristics.
  • in Table 1, arithmetic coding, for eight bits per word, achieves a compression ratio of 2.7724, which is greater than the 2.505 compression ratio achieved by Lempel-Ziv coding.
  • arithmetic coding takes more than fifteen minutes longer to encode than the Lempel-Ziv lossless coder.
  • method 10 likely would select the Lempel-Ziv coder in Step 24.
  • method 10 may decide to send the data uncompressed. This decision depends on, among other things, user requirements.
  • the input clock rate indicated in Table 1 is dependent upon both the media over which the data is transmitted (internet, for example) and the type of coding algorithm implemented.
  • t_processing includes the time duration associated with compressing the data, system delay, etc.
  • t_c is the time to transmit the data and equals the size of the file divided by the compression ratio and by the output speed, i.e., the bit rate, and reflects the time savings achieved by compressing the data.
  • the compression ratio (CR) is equal to the input file size divided by the output file size.
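Putting these quantities together, the selection criterion of Step 24 can be sketched numerically. The compression ratios and encode times below come from the document's discussion of Table 1; the file size and link rate are hypothetical values of our own:

```python
def total_time(t_processing_s, file_kbit, cr, out_rate_kbit_s):
    """Overall delivery time: processing time (compression, system delay)
    plus transmission time t_c = file size / CR / output bit rate."""
    return t_processing_s + file_kbit / cr / out_rate_kbit_s

# Ratios and encode times from Table 1; file size and link rate hypothetical.
file_kbit, rate = 500_000, 100
lz = total_time(522, file_kbit, 2.505, rate)       # Lempel-Ziv
arith = total_time(1582, file_kbit, 2.7724, rate)  # arithmetic coding

# The coder with the smaller total time wins, not the one with the
# best compression ratio alone; here Lempel-Ziv would be selected.
print(f"LZ: {lz:.0f} s, arithmetic: {arith:.0f} s")
```

This mirrors the document's reasoning: arithmetic coding compresses more, but its much longer encoding time can make a faster coder the better overall choice.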
  • method 10 encodes the data with the selected coder for, preferably, a predetermined amount of time. Thereafter, the program returns to Step 22 to code a new test portion of the data stream and select an optimum coder for encoding the next portion of the data stream. This operation may require implementing a different lossless coder shown in Chart 30.
  • a system 40 for performing method 10 includes an encoder 41 having an input interface 42 which includes a clock input Cl and a data input Dl for receiving a data stream 43 that is either synchronous or asynchronous.
  • Interface 42 is coupled to a digital signal processor (DSP) chip 46 via input and output data-control-synchronous input/output lines 44.
  • DSP 46 preferably performs steps 16 and 18 in method 10 shown in FIG. 1 to frame the data and prepare it for compression.
  • the output of DSP 46 is coupled to computer 50 via a PCI bus 48 that communicates the framed data to the computer.
  • Computer 50 preferably adds appropriate header codes to the data stream to indicate different packets of data and operates to encode/compress the test data with each lossless coder shown in chart 30.
  • computer 50 may comprise a plurality of processors each capable of encoding/compressing data for a corresponding one of the lossless coders implemented from the grid 30 in Fig. 1A.
  • a single computer 50 could be used to implement the test compression for each of the lossless coders 32 in a sequential fashion for a predetermined period of time.
  • Computer 50 may also be used to add header codes to the data to ensure that the file will be decompressed correctly.
  • the compressed data is then transmitted via PCI bus 48 to the DSP chip 46 to divide the data as necessary for the specific communication system implemented. This process may involve buffering the data by inserting empty blocks and/or deleting existing blocks. Thereafter, particular synchronization codes may be added and the data stream is transmitted along the input/output lines back to interface 42.
  • the particular interface code settings include designating the number of bits/word, the number of words per frame, synchronizing codes, a control sum, etc.
  • Interface 42 then outputs the data stream on line D2, so that it may be transmitted over a medium 52 such as the internet.
  • the output clock rate C2 is set by the operator and is dependent upon the type of medium 52 implemented.
  • the decoder 53 of system 40 includes an interface 54 having a data input D3 for receiving the compressed data from medium 52 at a clock rate C3 that corresponds to clock rate C2 output from interface 42. Notably, clocks C2 and C3 are optional.
  • Interface 54 transmits the compressed data stream via data-control-synchronous lines 56, while the POC synchronization added by encoder 41 is deleted.
  • a DSP chip 58 detects the header code(s) and removes empty blocks from the data stream.
  • the data processed by DSP chip 58 is then transmitted to a computer 62 via a PCI bus 60.
  • Computer 62 decompresses the data, and, preferably, implements conventional control sum check (CSC) comparison techniques.
  • Additional error detection or error correction coders may also be implemented by computer 62.
  • a Reed-Solomon error correction coder is standard for communication networks and is preferably included. Notably, the above-described processing operations may be performed either by computer(s) 50, 62 or DSP chips 46, 58, but the preferred implementation has been described.
  • A representation of the method steps described in FIG. 1 and performed by the apparatus shown in Fig. 2 is shown schematically in FIG. 3 for a telemetric data stream.
  • the arrow labeled A on the right-side of FIG. 3 indicates the encoding process, while the arrow B along the left-side of the data shown in FIG. 3 indicates the decoding process.
  • data stream 43 is input to interface 42 (FIG. 2) and then framed in a preassigned fashion into packet portions 64, 66 (preferably several kilobytes, e.g., two 8k portions), preferably by DSP 46. Thereafter, portions 64, 66 are compressed into, for example, a 4.5k block 68 and a 4.3k block 70.
  • a header 73, 75 is added to the blocks of the packets (with the interface information described above) to create blocks 72, 74, respectively. Then, the blocks are buffered to construct a buffered and compressed packet 76 which may be divided again if necessary to create stream 78. Then, POC synchronization is added by DSP chip 46 and new data stream 80 may be transmitted to decoder 53 via, for example, the internet 52 (FIG. 2) where it is decoded as described above.

Abstract

A method of lossless compression of a stream of data first includes using a plurality of lossless coders to compress a test portion of the data stream (30). Once the test portion is compressed, the method determines a performance characteristic(s) associated with each of the lossless coders (32). Then the method selects one of the lossless coders based on the performance characteristic(s) and encodes a first portion of the data stream with the selected coder. Thereafter, the method includes repeating the using, determining, selecting and encoding steps for another test portion and a second portion of the data stream. Notably, the repeating step may include selecting a different one of the lossless coders.

Description

METHOD AND APPARATUS FOR OPTIMIZED LOSSLESS COMPRESSION USING A PLURALITY OF CODERS
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is directed to data compression techniques and, more particularly, to a method and apparatus for selecting among different types of lossless compression coders to optimize system performance.
2. Description of the Related Art
Data compression operates to minimize the number of bits used to store or transmit information and encompasses a wide array of software and hardware compression techniques. Notably, depending on the type of data to be compressed and any number of other factors, particular compression techniques can provide markedly superior performance in terms of compression ratio and coding speed.
Generally, data compression includes taking a stream of symbols or phrases and converting them into codes that are smaller (in bit length) than the original data. Known compression techniques and algorithms can be divided into two major families including lossy and lossless. Lossy data compression can be used to greatly increase data compression ratios; however, increased compression comes at the expense of a certain loss in accuracy. As a result, lossy compression typically is implemented only in those instances in which some data loss is acceptable. For example, lossy compression is used effectively when applied to digitized voice signals and graphics images. Lossless compression, on the other hand, is a family of data compression that utilizes techniques designed to generate an exact duplicate of the input data stream after a compression/decompression cycle. This type of compression is necessary when storing database records, word processing files, etc., where loss of information is absolutely unacceptable. The present invention is directed to lossless data compression.
Some lossless compression algorithms use information theory to generate variable length codes when given a probability table for a given set of symbols. The decision to output a certain code for a particular symbol or set of symbols (i.e. , message) is based on a model. The model is a set of rules used to process input messages, and in response, determine which codes to output. An algorithm or program uses the model to analyze the symbols (e.g., determine a probability associated with the symbol) and then outputs an appropriate code based on that processing. There are any number of ways to model data, all of which can use the same coding technique to produce their output. In general, to compress data efficiently, a model should be selected that predicts symbols or phrases with high probabilities because symbols or messages that have a high probability have a low information content, and therefore require fewer bits to encode. The next step is to encode the symbols using a particular lossless coder. Conventionally, lossless compression coders can be grouped according to whether they implement statistical modeling or dictionary-based modeling. Statistical modeling reads and encodes a single symbol at a time using the probability of the character's appearance, while dictionary-based modeling uses a single code to replace strings of symbols. Notably, in dictionary-based modeling, the model is significantly more important than in statistical-based modeling because problems associated with encoding every symbol are significantly reduced.
One form of statistical data compression is known as Shannon-Fano (S-F) coding. S-F coding was developed to provide variable-length bit coding so as to allow coding symbols with exactly (or a close approximation to) the number of bits of information that the message or symbol contains. S-F coding relies on knowing the probability of each symbol's appearance in a message. After the probabilities are determined, a table of codes is constructed with each code having a different number of bits (advantageously, symbols with low probabilities have more bits). One problem with a coding technique such as this is that it creates variable length codes that have an integral number of bits, even though the information to be coded likely will require a non-integral number of bits.
Another type of coding, Huffman coding, is similar to S-F coding in that it creates variable length codes that are an integral number of bits, but it utilizes a completely different algorithm. Generally, S-F and Huffman coding are close in performance, but Huffman coding has been shown always to at least equal the efficiency of S-F coding; it is therefore preferred, especially since both algorithms require a similar amount of processing power. Although Huffman coding is relatively easy to implement and economical for both coding and decoding, it is inefficient due to its use of an integral number of bits per code, as in S-F coding. If a particular symbol is determined to have an information content (i.e., entropy) of 1.5 bits, a Huffman coder will generate a code having a bit count of either one or two bits. Generally, if a statistical method could assign a 90% probability to a given symbol, the optimal code size would be 0.15 bits; however, Huffman or S-F coding likely would assign a one bit code to the symbol, which is six times larger than necessary. In view of this problem associated with utilizing an integral number of bits, arithmetic coding was developed. Arithmetic coding replaces a stream of input symbols with a single floating point output number, and bypasses the step of replacing an input symbol with a specific code. Because an arithmetic code is not limited to being optimal only when the symbol probabilities are integral powers of one-half (which is most often not the case), it attains the theoretical entropy of the symbol to be coded, thus maximizing compression efficiency for any known source. In other words, if the entropy of a given character is 1.5 bits, arithmetic coding uses 1.5 bits to encode the symbol, an impossibility for Huffman and Shannon-Fano coding. Although arithmetic coding is extremely efficient, it consumes rather large amounts of computing resources, both in terms of CPU power and memory. This is due to the fact that sophisticated models that demand a significant amount of memory must be built, and that the algorithm itself requires a significant amount of computational operations.
In an alternative to the above types of lossless coding, known as substitutional or dictionary-based coding, dictionary-based compression algorithms replace occurrences of particular phrases (i.e., groups of bytes) in a data stream with a reference to a previous occurrence of those phrases. Unlike the above systems that achieve compression by encoding symbols into bit strings that use fewer bits than the original symbols, dictionary-based algorithms do not encode single symbols. Rather, dictionary-based compression techniques encode variable length strings of symbols as single "tokens." It is these tokens that form an index to a phrase dictionary. Because the tokens are smaller than the phrases they replace, compression occurs. Two main classes of dictionary-based compression schemes are known as the LZ77 and LZ78 compression algorithms of the Lempel-Ziv family of compression coders. Notably, dictionary-based coding is utilized extensively in desktop general purpose compression and has been implemented by CompuServe Information Service to encode bit-mapped graphical images. For example, the GIF format uses an LZW variant to compress repeated sequences and screen images. Although dictionary-based compression techniques are very popular forms of compression, the disadvantage of such algorithms is that a more sophisticated data structure is needed to handle the dictionary.
Overall, as communication mediums such as the internet expand, data compression will continue to be extremely important to the efficient communication of data, with different compression algorithms providing particular advantages in particular arenas. There are many types of data compression methods that are being implemented in the art, including those described above as well as others. In addition, there are many variants associated with each type of known compression algorithm and many improvements have been developed. Again, depending on any number of factors associated with the system and the type of data being compressed, each may be preferred to provide optimum data encoding.
Because different ones of the known coding techniques provide unique benefits depending upon various operational factors, including the data to be encoded, a lossless compression system that selectively encodes data with different types of coders was desired. The telecommunications industry, in particular, is in need of a system which implements different types of coders, especially when the incoming data is received from multiple sources that provide different types of unknown data, i.e., when different portions of the data stream would be optimally compressed with different coding techniques.
SUMMARY OF THE INVENTION
The present invention is directed to a method and apparatus that determines which of a number of embedded coding schemes will optimally compress different portions of an incoming data stream. The method of the preferred embodiment is designed to accommodate a data stream characterized by having different packets of information (e.g., from sources unknown to the encoders) each of which may have different associated statistics.
According to a first aspect of the preferred embodiment, a method of lossless compression of a stream of data includes providing a plurality of lossless coders. The method then includes selecting one of the lossless coders to compress the stream of data, and thereafter encoding the data stream with the selected lossless coder.
According to another aspect of the preferred embodiment, a method of lossless compression of a stream of data includes using a plurality of lossless coders to compress a test portion of the data stream. Once the test portion is compressed, the method determines a performance characteristic associated with each of the lossless coders. Then the method includes selecting one of the lossless coders based on the determining step and encoding a first portion of the data stream with the selected coder. Thereafter, the method includes repeating the using, determining, selecting and encoding steps for another test portion and a second portion of the data stream. Notably, the repeating step may include selecting a different one of the lossless coders.
According to a further aspect of the preferred embodiment, each of the lossless coders uses, in the encoding step, (1) a compression technique and (2) a number of bits per word determined by the selecting step. The compression technique is one of arithmetic coding, Huffman coding, and LZ coding.
According to yet another aspect of the preferred embodiment, an apparatus for lossless data compression includes an interface to receive a stream of data. In addition, the apparatus includes a plurality of lossless coders and a processor. In operation, each lossless coder separately compresses a test portion of the data stream and, in response, the processor determines a performance characteristic associated with each of the lossless coders, and then selects, based on the performance characteristics, one of the lossless coders to encode at least a first portion of the data stream.
According to a still further aspect of the preferred embodiment, the performance characteristic includes at least one of the compression ratio and the duration of the compression of the test portion for a corresponding lossless coder. Moreover, the encoder includes a plurality of processors, each of the lossless coders corresponding to one of the processors, with the lossless coders compressing the same test portion in parallel. These and other objects, advantages, and features of the invention will become apparent to those skilled in the art from the detailed description and the accompanying drawings. It should be understood, however, that the detailed description and accompanying drawings, while indicating preferred embodiments of the present invention, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the present invention without departing from the spirit thereof, and the invention includes all such modifications.
BRIEF DESCRIPTION OF THE DRAWINGS Preferred exemplary embodiments of the invention are illustrated in the accompanying drawings in which like reference numerals represent like parts throughout, and in which:
FIG. 1 is a flow diagram showing the general operation of a method of the preferred embodiment;
FIG. 1A is a chart showing an array of lossless coders used in the method shown in FIG. 1;
FIG. 2 is a generic block diagram showing an encoding/decoding system of the preferred embodiment; and
FIG. 3 is a schematic diagram showing the data stream as it is encoded/decoded by the system shown in FIG. 2.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, a method 10 includes, after initialization and start-up at Step 12, inputting data to the system at Step 14. The data input at Step 14 may be synchronous or asynchronous data. Notably, the data stream may be received from unspecified sources such as sensors that monitor temperature, pressure, etc. of a subject (e.g., telemetric data gathered in military applications) and that continuously transmit readings to the system encoder (described below) of the preferred embodiment. Unspecified data necessarily implies that the statistics associated with the data are random; therefore, unlike known systems that perform compression with a single type of encoder based on knowledge regarding the statistics of the data, the preferred embodiment is able to efficiently code a data stream comprised of different types of data. Other applications where this type of random data may originate from multiple sources include hospital monitoring applications, chemical factories, nuclear plants, and others. As the data is continuously input to the system, it is transmitted to a division on communication block where method 10, at Step 16, processes the data by dividing or framing the data for further communication thereof. The division on communication block is implemented by method 10 in conventional fashion. Next, at Step 18, the data is pre-processed, which may include generating a histogram indicating the statistics associated with the data framed in Step 16.
Once the data has been pre-processed at Step 18, method 10 adds synchronization and header codes at Step 20, as required to further process and identify the data bits in the data stream. Upon completion of Step 20, the data is transmitted to a plurality of coders that provide lossless compression. In particular, at Step 22, method 10 codes a test portion of the data stream with a plurality of lossless coders and determines system performance criteria associated with each of the coders. The coders used to code the portion of the data in Step 22 are shown at 32 in chart 30 of FIG. 1A. The columns of the chart indicate different types of lossless coding techniques/algorithms, which may include Huffman coding, arithmetic coding, and Lempel-Ziv coding, as well as variants of these and other known coding techniques. Notably, the method also compares the output of the coding techniques with the uncompressed data stream because, in certain circumstances, leaving the data uncompressed may be optimum.
In general, the columns comprise lossless coding techniques, and the rows comprise different designations for the number of bits per word, bpw 1 through m, that may be used to encode the data. For instance, the bits per word associated with the interface may be set to eight bits, ten bits, etc. As a result, at Step 22, method 10 codes a portion of the data with n x m lossless coders. Preferably, Step 22 is performed for a test period of time, or on a test amount of data, to determine which of the lossless coders achieves optimum system performance prior to encoding the data (described below).
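A sketch of the n x m trial-coding grid of Step 22 might look like the following. The Python standard-library codecs stand in for the Huffman, arithmetic, and Lempel-Ziv coders named in the patent, and the word-width handling is reduced to a parameter that a real implementation would use to reframe the stream; the names and values here are assumptions:

```python
import bz2
import itertools
import lzma
import time
import zlib

# Stand-in techniques (the patent's coders are Huffman, arithmetic, and Lempel-Ziv).
TECHNIQUES = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
BITS_PER_WORD = (8, 10, 12)   # illustrative word-width settings (bpw 1..m)

def trial_grid(test_portion: bytes) -> dict:
    """Run every (technique, bits-per-word) coder over the same test portion."""
    results = {}
    for (name, compress), bpw in itertools.product(TECHNIQUES.items(), BITS_PER_WORD):
        start = time.perf_counter()
        # A real implementation would first reframe the stream into bpw-bit words.
        compressed = compress(test_portion)
        elapsed = time.perf_counter() - start
        ratio = len(test_portion) / max(len(compressed), 1)
        results[(name, bpw)] = (ratio, elapsed)
    return results
```

With three techniques and three word widths, the grid yields nine (ratio, time) pairs per test portion, matching the nine coders of Table 1.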
The test compression performed by coders 32 in Step 22 preferably is conducted in parallel to quickly compile performance data for each of the lossless coders. Parallel coding of the test data is practical because computing power has become so inexpensive that the benefits (e.g., in terms of encoding speed) greatly outweigh the costs. Nevertheless, in an alternative embodiment, each of coders 32 shown in FIG. 1A may code the test data sequentially over a designated period of time to produce the corresponding performance data. Although not preferred, such a sequential test may be performed when computing power is at a premium.
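The parallel trial compression could be sketched with a thread pool running every coder over the same test portion concurrently. The patent envisions one processor per coder, so this shared-pool version, with standard-library codecs as stand-ins, is a simplifying assumption:

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor

# Stand-in coders; the patent's parallel embodiment uses one processor per coder.
CODERS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def parallel_trial(test_portion: bytes) -> dict:
    """Compress the same test portion with every coder concurrently."""
    with ThreadPoolExecutor(max_workers=len(CODERS)) as pool:
        futures = {name: pool.submit(fn, test_portion) for name, fn in CODERS.items()}
        return {name: len(future.result()) for name, future in futures.items()}
```

The sequential alternative is simply a loop over the same dictionary, trading wall-clock time for a single processing element.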
Turning to Table 1 below, the performance criteria generated in Step 22 for nine different lossless coders (three different word lengths x three different coding techniques) are shown. Note first that the input bit rate is set at a predetermined value, while the output bit rate, although preferably set based on the transmission medium employed, may be continuously updated based on feedback information regarding the lossless coder employed. Optimally, the output bit rate will be made as small as possible. As shown in Table 1, after compressing a test amount of data, an output file size in bytes, a compression ratio, and a time to encode are each determined for a designated input speed (kbit/second) and output speed (kbit/second). For example, for an input file having 304,180,992 bytes and using eight bits per word, Huffman coding achieves a compression ratio of 1.8272, Lempel-Ziv coding achieves a ratio of 2.505, and arithmetic coding achieves a ratio of 2.7724. In addition, the times to encode the test data for these algorithms are 128 seconds, 522 seconds, and 1,582 seconds, respectively. Once the performance criteria are generated for each lossless coder, method 10 executes Step 24 to select one of the coders to code/compress the data for a predetermined amount of time or for a particular amount of data.
Notably, the selection made in Step 24 typically is not based solely on the compression ratio realized; rather, the selection is made based on a combination of overall processing time and compression ratio performance characteristics. For example, in Table 1, arithmetic coding, for eight bits per word, achieves a compression ratio of 2.7724, which is greater than the compression ratio of 2.505 achieved by Lempel-Ziv coding. However, arithmetic coding takes more than fifteen minutes longer to encode than the Lempel-Ziv lossless coder. In this case, method 10 likely would select the Lempel-Ziv coder in Step 24. However, if the performance achieved by all n x m lossless coders does not satisfy a threshold level, method 10 may decide to send the data uncompressed. This decision depends on, among other things, user requirements.
The input clock rate indicated in Table 1 is dependent upon both the medium over which the data is transmitted (the internet, for example) and the type of coding algorithm implemented. The time performance criterion is generated according to the following equation:

t_overall = t_c + t_processing    (Eq. 1)

In Equation 1, t_processing includes the time duration associated with compressing the data, system delay, etc. Further, t_c is the time to transmit the data; it equals the size of the file divided by the compression ratio and by the output speed (i.e., the bit rate), and reflects the time savings achieved by compressing the data. The compression ratio (CR) is equal to the input file size divided by the output file size.

Table 1
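Equation 1 and the CR definition combine into a simple selection rule. The sketch below uses the Table 1 figures quoted in the text; the 1 MB/s output rate and the function names are assumptions:

```python
def overall_time(file_size_bytes, cr, out_rate_bytes_per_s, t_processing_s):
    """Eq. 1: t_overall = t_c + t_processing, with t_c = size / CR / output rate."""
    t_c = file_size_bytes / cr / out_rate_bytes_per_s
    return t_c + t_processing_s

SIZE = 304_180_992      # input file size from Table 1, in bytes
OUT_RATE = 1_000_000    # assumed output rate in bytes/second (not given in the text)
candidates = {          # coder: (compression ratio, encode time in seconds)
    "huffman":    (1.8272, 128.0),
    "lempel-ziv": (2.5050, 522.0),
    "arithmetic": (2.7724, 1582.0),
}
best = min(candidates,
           key=lambda c: overall_time(SIZE, candidates[c][0], OUT_RATE, candidates[c][1]))
```

As the output rate falls, the transmission term t_c dominates and the higher-ratio coders become preferable; at high output rates the faster coders win, which is why Step 24 weighs both encode time and compression ratio rather than the ratio alone.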
[Table 1 is reproduced as an image in the original document; the compression ratios and encoding times it reports are quoted in the text above.]
Once a coder (n,m) 32 (FIG. 1A) is selected in Step 24, method 10 encodes the data with the selected coder for, preferably, a predetermined amount of time. Thereafter, the program returns to Step 22 to code a new test portion of the data stream and select an optimum coder for encoding the next portion of the data stream. This operation may result in implementing a different lossless coder shown in chart 30.
Turning to FIG. 2, a system 40 for performing method 10 includes an encoder 41 having an input interface 42, which includes a clock input C1 and a data input D1 for receiving a data stream 43 that is either synchronous or asynchronous. Interface 42 is coupled to a digital signal processor (DSP) chip 46 via input and output data-control-synchronous input/output lines 44. Notably, DSP 46 preferably performs Steps 16 and 18 of method 10 shown in FIG. 1 to frame the data and prepare it for compression. The output of DSP 46 is coupled to computer 50 via a PCI bus 48 that communicates the framed data to the computer. Computer 50 preferably adds appropriate header codes to the data stream to indicate different packets of data and operates to encode/compress the test data with each lossless coder shown in chart 30. As noted above, computer 50 may comprise a plurality of processors, each capable of encoding/compressing data for a corresponding one of the lossless coders implemented from chart 30 in FIG. 1A. Alternatively, a single computer 50 could be used to implement the test compression for each of the lossless coders 32 in a sequential fashion for a predetermined period of time.
Computer 50 may also be used to add header codes to the data to ensure that the file will be decompressed correctly. The compressed data is then transmitted via PCI bus 48 to the DSP chip 46 to divide the data as necessary for the specific communication system implemented. This process may involve buffering the data by inserting empty blocks and/or deleting existing blocks. Thereafter, particular synchronization codes may be added and the data stream is transmitted along the input/output lines back to interface 42. The particular interface code settings include designating the number of bits/word, the number of words per frame, synchronizing codes, a control sum, etc. Interface 42 then outputs the data stream on line D2, so that it may be transmitted over a medium 52 such as the internet. Preferably, the output clock rate C2 is set by the operator and is dependent upon the type of medium 52 implemented.
Next, the decoder 53 of system 40 includes an interface 54 having a data input D3 for receiving the compressed data from medium 52 at a clock rate C3 that corresponds to clock rate C2 output from interface 42. Notably, clocks C2 and C3 are optional. Interface 54 transmits the compressed data stream via data-control-synchronous lines 56, while the POC synchronization added by encoder 41 is deleted. Thereafter, a DSP chip 58 detects the header code(s) and removes empty blocks from the data stream. The data processed by DSP chip 58 is then transmitted to a computer 62 via a PCI bus 60. Computer 62 decompresses the data and, preferably, implements conventional control sum check (CSC) comparison techniques. Additional error detection or error correction coders may also be implemented by computer 62. A Reed-Solomon error correction coder is standard for communication networks and is preferably included. Notably, the above-described processing operations may be performed either by computers 50, 62 or DSP chips 46, 58, but the preferred implementation has been described. The decompressed data is then transmitted back to interface 54 and onto data line D4 at a clock rate C4 = C1.
A representation of the method steps described in FIG. 1 and performed by the apparatus shown in FIG. 2 is shown schematically in FIG. 3 for a telemetric data stream. The arrow labeled A on the right side of FIG. 3 indicates the encoding process, while the arrow B along the left side of the data shown in FIG. 3 indicates the decoding process. More particularly, data stream 43 is input to interface 42 (FIG. 2) and then framed in a preassigned fashion into packet portions 64, 66 (preferably several kilobytes each, e.g., two 8k portions), preferably by DSP 46. Thereafter, portions 64, 66 are compressed into, for example, a 4.5k block 68 and a 4.3k block 70. Then, headers 73, 75 are added to the blocks of the packets (with the interface information described above) to create blocks 72, 74, respectively. The blocks are then buffered to construct a buffered and compressed packet 76, which may be divided again if necessary to create stream 78. Then, POC synchronization is added by DSP chip 46 and the new data stream 80 may be transmitted to decoder 53 via, for example, the internet 52 (FIG. 2), where it is decoded as described above.
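The encode path of FIG. 3 (frame into packets, compress each, prepend a header) might be sketched as follows; the 4-byte length header and the use of zlib as the selected coder are assumptions, since the patent does not specify the header layout:

```python
import struct
import zlib

PACKET = 8 * 1024   # 8k framing, per FIG. 3

def encode_stream(data: bytes) -> list:
    """Frame the stream into 8k packets, compress each, and prepend a length header."""
    blocks = []
    for i in range(0, len(data), PACKET):
        compressed = zlib.compress(data[i:i + PACKET])
        header = struct.pack(">I", len(compressed))   # assumed 4-byte big-endian length
        blocks.append(header + compressed)
    return blocks

def decode_stream(blocks) -> bytes:
    """Reverse path: strip each header, decompress, and reassemble the stream."""
    out = b""
    for block in blocks:
        (length,) = struct.unpack(">I", block[:4])
        out += zlib.decompress(block[4:4 + length])
    return out
```

Because each block carries its own length header, the decoder can locate packet boundaries even after the stream is re-divided or padded with empty blocks in transit.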
Many changes and modifications may be made within the scope of the present invention without departing from the spirit thereof. Other changes and modifications falling within the scope of the invention will become apparent from the appended claims.

Claims

What is claimed is:
1. A method of lossless compression of a stream of data, the method comprising the steps of: providing a plurality of different types of lossless coders; selecting one of the lossless coders to compress the data stream; and encoding the data stream with the selected lossless coder.
2. The method of claim 1, further comprising the step of, prior to said selecting step, individually compressing at least a portion of the data stream with each of the lossless coders; and wherein said selecting step is performed based on said compressing step.
3. The method of claim 2, wherein said selecting step is based on a performance characteristic associated with said compressing step.
4. The method of claim 3, wherein the performance characteristic includes at least one of compression ratio and duration of said compressing step for a corresponding lossless coder.
5. The method of claim 1, wherein at least one of the lossless coders uses statistical modeling.
6. The method of claim 5, wherein at least another of the lossless coders uses dictionary-based modeling.
7. The method of claim 2, wherein the lossless coders perform said compressing step in parallel.
8. The method of claim 2, wherein the lossless coders perform said compressing step sequentially.
9. The method of claim 1, wherein the lossless coders are defined in part by a number of bits per word used in said encoding step.
10. A method of lossless compression of a stream of data, the method comprising the steps of: using a plurality of different types of lossless coders to compress a test portion of the data stream; determining a performance characteristic associated with each of the lossless coders in response to said using step; selecting one of the lossless coders based on said determining step; encoding a first portion of the data stream with the selected coder; and repeating said using, determining, selecting and encoding steps for another test portion and a second portion of the data stream.
11. The method of claim 10, wherein said repeating step includes selecting a different one of the lossless coders.
12. The method of claim 10, wherein the lossless encoders perform said using step in parallel.
13. The method of claim 10, wherein the lossless encoders perform said using step sequentially.
14. The method of claim 10, wherein each of the lossless coders uses (1) a compression technique, and (2) a number of bits per word, the number being determined by said selecting step, in said encoding step.
15. The method of claim 14, wherein the compression technique is one of Arithmetic coding, Huffman coding and LZ coding.
16. The method of claim 10, wherein the data stream comprises data from a plurality of different sources.
17. The method of claim 10, wherein the performance characteristic includes at least one of compression ratio and duration of said using step for a corresponding lossless coder.
18. An apparatus for lossless data compression, the apparatus comprising: an interface to receive a stream of data; a plurality of different types of lossless coders; a processor; and wherein each said lossless coder separately compresses a test portion of the data stream and, in response, said processor (1) determines a performance characteristic associated with each said lossless coder, and (2) selects, based on said performance characteristics, one of said lossless coders to encode at least a first portion of the data stream.
19. The apparatus of claim 18, wherein the performance characteristic includes at least one of compression ratio and duration of the compression of the test portion.
20. The apparatus of claim 18, wherein said encoder includes a plurality of processors and each said lossless coder corresponds to one of the processors, and wherein said lossless coders compress the same test portion in parallel.
21. The apparatus of claim 18, wherein the data stream comprises data from a plurality of different sources.
22. An apparatus for lossless data compression, the apparatus comprising: an encoder including an interface to receive a stream of data, a plurality of different types of lossless coders and a processor, wherein each said lossless coder separately compresses a test portion of the data stream and, in response, said processor (1) determines a performance characteristic associated with each said lossless coder, and (2) selects, based on said performance characteristics, one of said lossless coders to encode at least a first portion of the data stream; and a decoder that receives and decompresses said encoded first portion of the data stream.
23. A method of lossless compression of a stream of data, the method comprising the steps of: using a plurality of different types of lossless coders to compress a test portion of the data stream; determining a performance characteristic associated with each of the lossless coders in response to said using step; selecting one of the lossless coders based on said determining step; encoding a first portion of the data stream with the selected coder; and repeating said using, determining, selecting and encoding steps for another test portion and a second portion of the data stream, wherein said repeating step includes selecting a different one of the lossless coders; transmitting the encoded first portion via a communication medium to a decoder; and decompressing the encoded first portion.
24. The method of claim 23, wherein the medium is the internet.
25. The method of claim 23, wherein the performance characteristic includes at least one of compression ratio and duration of said using step for a corresponding lossless coder.
26. The method of claim 23, wherein the data stream comprises data from a plurality of different sources.
PCT/US2001/005722 2000-02-25 2001-02-22 Method and apparatus for optimized lossless compression using a plurality of coders WO2001063772A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU2001241672A AU2001241672A1 (en) 2000-02-25 2001-02-22 Method and apparatus for optimized lossless compression using a plurality of coders
EP01912942A EP1266455A4 (en) 2000-02-25 2001-02-22 Method and apparatus for optimized lossless compression using a plurality of coders
JP2001562848A JP2003524983A (en) 2000-02-25 2001-02-22 Method and apparatus for optimized lossless compression using multiple coders

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US51330900A 2000-02-25 2000-02-25
US09/513,309 2000-02-25

Publications (1)

Publication Number Publication Date
WO2001063772A1 (en) 2001-08-30

Family

ID=24042716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/005722 WO2001063772A1 (en) 2000-02-25 2001-02-22 Method and apparatus for optimized lossless compression using a plurality of coders

Country Status (6)

Country Link
EP (1) EP1266455A4 (en)
JP (1) JP2003524983A (en)
CN (1) CN1426629A (en)
AU (1) AU2001241672A1 (en)
TW (1) TWI273779B (en)
WO (1) WO2001063772A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8090936B2 (en) 2000-02-03 2012-01-03 Realtime Data, Llc Systems and methods for accelerated loading of operating systems and application programs
CN102568520A (en) * 2010-12-16 2012-07-11 富泰华工业(深圳)有限公司 Test device and method
CN102595496A (en) * 2012-03-08 2012-07-18 西北大学 Context-adaptive quotient and remainder encoding method used for sensing data of wireless sensing nodes
US8295615B2 (en) 2007-05-10 2012-10-23 International Business Machines Corporation Selective compression of synchronized content based on a calculated compression ratio
US8692695B2 (en) 2000-10-03 2014-04-08 Realtime Data, Llc Methods for encoding and decoding data
US8867610B2 (en) 2001-02-13 2014-10-21 Realtime Data Llc System and methods for video and audio data distribution
US8933825B2 (en) 1998-12-11 2015-01-13 Realtime Data Llc Data compression systems and methods
US9116908B2 (en) 1999-03-11 2015-08-25 Realtime Data Llc System and methods for accelerated data storage and retrieval
US9143546B2 (en) 2000-10-03 2015-09-22 Realtime Data Llc System and method for data feed acceleration and encryption
WO2015199856A1 (en) * 2014-06-26 2015-12-30 Intel Corporation Compression configuration identification
US10313256B2 (en) 2015-05-21 2019-06-04 Intel Corporation Apparatus and methods for adaptive data compression
CN111314277A (en) * 2019-11-13 2020-06-19 谢卓鹏 Compression method based on GNSS big data

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN100369489C (en) * 2005-07-28 2008-02-13 上海大学 Embedded wireless coder of dynamic access code tactics
CN101615910B (en) 2009-05-31 2010-12-22 华为技术有限公司 Method, device and equipment of compression coding and compression coding method
CN102111161B (en) * 2010-11-16 2013-07-17 北京航天数控系统有限公司 Method and device for acquiring encoder data
EP3311494B1 (en) * 2015-06-15 2021-12-22 Ascava, Inc. Performing multidimensional search, content-associative retrieval, and keyword-based search and retrieval on data that has been losslessly reduced using a prime data sieve
US9209833B1 (en) * 2015-06-25 2015-12-08 Emc Corporation Methods and apparatus for rational compression and decompression of numbers

Citations (2)

Publication number Priority date Publication date Assignee Title
US5485526A (en) * 1992-06-02 1996-01-16 Hewlett-Packard Corporation Memory circuit for lossless data compression/decompression dictionary storage
US5708511A (en) * 1995-03-24 1998-01-13 Eastman Kodak Company Method for adaptively compressing residual digital image data in a DPCM compression system

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
JPH0773249B2 (en) * 1989-06-29 1995-08-02 富士通株式会社 Speech encoding / decoding transmission method
CA2020084C (en) * 1989-06-29 1994-10-18 Kohei Iseda Voice coding/decoding system having selected coders and entropy coders
JPH07210324A (en) * 1994-01-13 1995-08-11 Hitachi Ltd Storage device
FI962381A (en) * 1996-06-07 1997-12-08 Nokia Telecommunications Oy Compressing data on a communication connection


Non-Patent Citations (1)

Title
See also references of EP1266455A4 *

Cited By (31)

Publication number Priority date Publication date Assignee Title
US8933825B2 (en) 1998-12-11 2015-01-13 Realtime Data Llc Data compression systems and methods
US10033405B2 (en) 1998-12-11 2018-07-24 Realtime Data Llc Data compression systems and method
US9054728B2 (en) 1998-12-11 2015-06-09 Realtime Data, Llc Data compression systems and methods
US10019458B2 (en) 1999-03-11 2018-07-10 Realtime Data Llc System and methods for accelerated data storage and retrieval
US9116908B2 (en) 1999-03-11 2015-08-25 Realtime Data Llc System and methods for accelerated data storage and retrieval
US9792128B2 (en) 2000-02-03 2017-10-17 Realtime Data, Llc System and method for electrical boot-device-reset signals
US8112619B2 (en) 2000-02-03 2012-02-07 Realtime Data Llc Systems and methods for accelerated loading of operating systems and application programs
US8880862B2 (en) 2000-02-03 2014-11-04 Realtime Data, Llc Systems and methods for accelerated loading of operating systems and application programs
US8090936B2 (en) 2000-02-03 2012-01-03 Realtime Data, Llc Systems and methods for accelerated loading of operating systems and application programs
US9141992B2 (en) 2000-10-03 2015-09-22 Realtime Data Llc Data feed acceleration
US9667751B2 (en) 2000-10-03 2017-05-30 Realtime Data, Llc Data feed acceleration
US9859919B2 (en) 2000-10-03 2018-01-02 Realtime Data Llc System and method for data compression
US10284225B2 (en) 2000-10-03 2019-05-07 Realtime Data, Llc Systems and methods for data compression
US9143546B2 (en) 2000-10-03 2015-09-22 Realtime Data Llc System and method for data feed acceleration and encryption
US9967368B2 (en) 2000-10-03 2018-05-08 Realtime Data Llc Systems and methods for data block decompression
US10419021B2 (en) 2000-10-03 2019-09-17 Realtime Data, Llc Systems and methods of data compression
US8692695B2 (en) 2000-10-03 2014-04-08 Realtime Data, Llc Methods for encoding and decoding data
US8934535B2 (en) 2001-02-13 2015-01-13 Realtime Data Llc Systems and methods for video and audio data storage and distribution
US8867610B2 (en) 2001-02-13 2014-10-21 Realtime Data Llc System and methods for video and audio data distribution
US9762907B2 (en) 2001-02-13 2017-09-12 Realtime Adaptive Streaming, LLC System and methods for video and audio data distribution
US9769477B2 (en) 2001-02-13 2017-09-19 Realtime Adaptive Streaming, LLC Video data compression systems
US8929442B2 (en) 2001-02-13 2015-01-06 Realtime Data, Llc System and methods for video and audio data distribution
US10212417B2 (en) 2001-02-13 2019-02-19 Realtime Adaptive Streaming Llc Asymmetric data decompression systems
US8295615B2 (en) 2007-05-10 2012-10-23 International Business Machines Corporation Selective compression of synchronized content based on a calculated compression ratio
CN102568520A (en) * 2010-12-16 2012-07-11 富泰华工业(深圳)有限公司 Test device and method
CN102568520B (en) * 2010-12-16 2016-10-12 富泰华工业(深圳)有限公司 Test device and method
CN102595496A (en) * 2012-03-08 2012-07-18 西北大学 Context-adaptive quotient and remainder encoding method used for sensing data of wireless sensing nodes
US9681332B2 (en) 2014-06-26 2017-06-13 Intel Corporation Compression configuration identification
WO2015199856A1 (en) * 2014-06-26 2015-12-30 Intel Corporation Compression configuration identification
US10313256B2 (en) 2015-05-21 2019-06-04 Intel Corporation Apparatus and methods for adaptive data compression
CN111314277A (en) * 2019-11-13 2020-06-19 谢卓鹏 Compression method based on GNSS big data

Also Published As

Publication number Publication date
JP2003524983A (en) 2003-08-19
CN1426629A (en) 2003-06-25
EP1266455A4 (en) 2003-06-18
TWI273779B (en) 2007-02-11
EP1266455A1 (en) 2002-12-18
AU2001241672A1 (en) 2001-09-03

Similar Documents

Publication Publication Date Title
US5532694A (en) Data compression apparatus and method using matching string searching and Huffman encoding
Shanmugasundaram et al. A comparative study of text compression algorithms
EP1057269B1 (en) Block-wise adaptive statistical data compressor
US5652581A (en) Distributed coding and prediction by use of contexts
US5955976A (en) Data compression for use with a communications channel
EP1266455A1 (en) Method and apparatus for optimized lossless compression using a plurality of coders
EP0695040B1 (en) Data compressing method and data decompressing method
EP0793349A2 (en) Method and apparatus for performing data compression
EP0582907A2 (en) Data compression apparatus and method using matching string searching and Huffman encoding
JPH01125028A (en) Method and apparatus for compression of compatible data
US5877711A (en) Method and apparatus for performing adaptive data compression
US6919826B1 (en) Systems and methods for efficient and compact encoding
JPS6356726B2 (en)
US6748520B1 (en) System and method for compressing and decompressing a binary code image
US7342902B2 (en) Two stage loss-less compressor for a clear channel over a packet network
CN101534124A (en) Compression algorithm for short natural language
US6292115B1 (en) Data compression for use with a communications channel
Rathore et al. A brief study of data compression algorithms
Gupta et al. A review on different types of lossless data compression techniques
Senthil et al. Text compression algorithms: A comparative study
KR100462789B1 (en) method and apparatus for multi-symbol data compression using a binary arithmetic coder
US20080001790A1 (en) Method and system for enhancing data compression
CN1656688B (en) Processing digital data prior to compression
US11722149B2 (en) Deflate compression using sub-literals for reduced complexity Huffman coding
Rani et al. A survey on lossless text data compression techniques

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 562848

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 2001912942

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 018085873

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2001912942

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2001912942

Country of ref document: EP