
lossless compression in hardware: what to do in case of uncompressibility?

Started by Denkedran Joe November 29, 2007
cms wrote:

> It isn't right to apply lossless algorithms to fixed-bandwidth
> systems.
Depends on the algorithm. E.g. RLE will never cause an increase in size, which is sufficient even for use with fixed-bandwidth channels.

DoDi
Hans-Peter Diettrich wrote:

> Depends on the algorithm. E.g. RLE will never cause an increase in size,
Wrong. It's trivially easy to prove that *no* lossless compression algorithm can avoid sometimes causing an increase in size while still compressing at least some inputs. Suppose the coder never expands anything. Lossless means injective, so restricted to the finite set of all strings of at most N bits it maps that set into itself, and an injective map from a finite set into itself is a bijection. A bijection onto the same set leaves the total number of bits across the set unchanged, so if it shortens even one string it must lengthen another --- contradiction.

[F'up2 comp.compression, which is the only place this could possibly be on-topic]
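[The count behind that argument can be tabulated directly. A tiny illustrative program (editor's sketch, not from any poster): there are more N-bit inputs than there are distinct strings shorter than N bits, so no decodable code can shorten them all.]

    #include <stdio.h>

    /* Pigeonhole count: 2^N inputs of exactly N bits, but only 2^N - 1
       binary strings of length N-1 or less (empty string included), so
       a lossless code cannot shorten every N-bit input. */
    int main(void)
    {
        for (unsigned n = 1; n <= 8; n++) {
            unsigned long inputs  = 1UL << n;        /* strings of length n    */
            unsigned long shorter = (1UL << n) - 1;  /* strings shorter than n */
            printf("N=%u: %lu inputs vs %lu shorter outputs\n",
                   n, inputs, shorter);
        }
        return 0;
    }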
On 2007-12-02, Hans-Peter Diettrich <DrDiettrich1@aol.com> wrote:
> Depends on the algorithm. E.g. RLE will never cause an increase in size
Even RLE can increase the size. Two of the most used methods to indicate runs are 1) a prefix byte, 2) two consecutive identical bytes. For both of these methods you can construct a file (or the file sometimes just happens to be one) containing 1) the value of the prefix byte, or 2) exactly two equal bytes. Even if you use some other method of indicating the runs, you can always construct a pathological file that increases in size after compression. This fact does not change depending on what the compression algorithm is. The best you can do is have one bit to select uncompressed/compressed.

followups set to comp.compression

-Pasi
-- 
"You're the type of guy who doesn't like to give up control."
        -- Wade to Garibaldi in Babylon 5: "The Exercise of Vital Powers"
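[To make the prefix-byte failure mode concrete, here is a minimal sketch (editor's hypothetical code, not from any poster) of such an RLE together with a pathological input that grows by 50%. Every isolated occurrence of the escape value costs two output bytes, which is exactly the expansion described above.]

    #include <stdio.h>
    #include <stdint.h>
    #include <stddef.h>

    #define ESC 0x90  /* prefix byte marking a run: ESC, count, value */

    /* Minimal prefix-byte RLE sketch: runs of 4 or more identical bytes
       become ESC,count,value; a literal ESC in the data must itself be
       escaped as ESC,0, which is where the expansion comes from. */
    static size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out)
    {
        size_t o = 0, i = 0;
        while (i < n) {
            size_t run = 1;
            while (i + run < n && in[i + run] == in[i] && run < 255)
                run++;
            if (run >= 4) {              /* long run: 3 output bytes   */
                out[o++] = ESC;
                out[o++] = (uint8_t)run;
                out[o++] = in[i];
                i += run;
            } else if (in[i] == ESC) {   /* literal ESC: 2 output bytes */
                out[o++] = ESC;
                out[o++] = 0;
                i++;
            } else {
                out[o++] = in[i++];      /* ordinary literal: 1 byte    */
            }
        }
        return o;
    }

    int main(void)
    {
        uint8_t in[64], out[2 * sizeof in];
        /* Pathological input: isolated ESC bytes everywhere, no runs. */
        for (size_t i = 0; i < sizeof in; i++)
            in[i] = (i % 2 == 0) ? ESC : (uint8_t)i;
        printf("in %zu bytes -> out %zu bytes\n",
               sizeof in, rle_encode(in, sizeof in, out));  /* 64 -> 96 */
        return 0;
    }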
Denkedran Joe wrote:
> So my idea was to put the question to all of you what to do in case of
> uncompressibility? Any ideas?
Hi,

Generally, a compression algorithm may be optimized for one type of data, or for a "packet" of data (e.g. Huffman compression with dynamically built tables, or different algorithms for video, sound or exe files, etc). If you can find two (or more) complementary algorithms (or sets of fixed parameters for a single algorithm) that cover the whole flow, you can dynamically switch between them without transmitting the parameters. Knowledge of the data flow may help you. E.g., if you may have to transmit an already-compressed packet, your problem may have no solution at all.
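[One common way to realize such a switch (editor's sketch under assumed framing conventions; nothing here is specified in the thread) is a per-block selector byte: try the preferred coder and fall back to a raw copy whenever it loses, which bounds the worst case at one header byte per block.]

    #include <stdint.h>
    #include <string.h>
    #include <stddef.h>

    enum { BLK_RAW = 0, BLK_PACKED = 1 };   /* hypothetical 1-byte selector */

    /* Stand-in for whichever coder the hardware implements; returns the
       packed size, or 0 when it cannot beat the raw block. */
    static size_t pack(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
    {
        (void)in; (void)out; (void)cap; (void)n;
        return 0;                           /* pretend nothing compresses */
    }

    /* Frame one block: packed if the coder wins, raw otherwise. Worst-case
       expansion is bounded at 1 byte per block instead of being unbounded. */
    static size_t frame_block(const uint8_t *in, size_t n, uint8_t *out)
    {
        size_t c = pack(in, n, out + 1, n - 1); /* must beat raw to be used */
        if (c != 0 && c < n) {
            out[0] = BLK_PACKED;
            return 1 + c;
        }
        out[0] = BLK_RAW;
        memcpy(out + 1, in, n);
        return 1 + n;
    }

    int main(void)
    {
        uint8_t in[256] = {0}, out[257];
        return frame_block(in, sizeof in, out) == 257 ? 0 : 1;
    }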
Hans-Peter Diettrich wrote:
> cms wrote:
>
>> It isn't right to apply lossless algorithms to
>> fixed-bandwidth systems.
>
> Depends on the algorithm. E.g. RLE will never cause an increase in
> size, which is sufficient even for use with fixed-bandwidth channels.
Nit. Depends on the data. You have to have something to separate a repetition count from a new value, which implies a lost char (or more) somewhere. However, any such expansion can be expected to be small.

-- 
Chuck F (cbfalconer at maineline dot net)
    <http://cbfalconer.home.att.net>
    Try the download section.
CBFalconer <cbfalconer@yahoo.com> writes:
> Hans-Peter Diettrich wrote:
>> cms wrote:
>>
>>> It isn't right to apply lossless algorithms to
>>> fixed-bandwidth systems.
>>
>> Depends on the algorithm. E.g. RLE will never cause an increase in
>> size, which is sufficient even for use with fixed-bandwidth channels.
>
> Nit. Depends on the data. You have to have something to separate
> a repetition count from a new value, which implies a lost char (or
> more) somewhere. However, any such expansion can be expected to be
> small.
Most RLEs can be caused to explode horribly, simply by peppering the file with the value used as the escape code(s). Packbits solves this problem, and it will never expand more than a small fraction, as it escapes both blocks of literals and runs.

Phil
-- 
Dear aunt, let's set so double the killer delete select all.
        -- Microsoft voice recognition live demonstration
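[For reference, a PackBits-style coder along the lines Phil names (editor's sketch; the exact control-byte conventions are assumed) escapes literal stretches in groups of up to 128 bytes, so the worst case is one control byte per 128 input bytes rather than a blow-up.]

    #include <stdio.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* PackBits-style sketch: control byte 0..127 means (n+1) literal
       bytes follow; 129..255 means the next byte repeats (257 - n)
       times. Literals travel in groups of up to 128, so worst-case
       expansion is one control byte per 128 input bytes (under 1%). */
    static size_t packbits_encode(const uint8_t *in, size_t n, uint8_t *out)
    {
        size_t o = 0, i = 0;
        while (i < n) {
            size_t run = 1;
            while (i + run < n && in[i + run] == in[i] && run < 128)
                run++;
            if (run >= 3) {                    /* run: 2 output bytes */
                out[o++] = (uint8_t)(257 - run);
                out[o++] = in[i];
                i += run;
            } else {                           /* gather a literal group */
                size_t start = i, lit = 0;
                while (i < n && lit < 128) {
                    if (i + 2 < n && in[i] == in[i + 1] && in[i] == in[i + 2])
                        break;                 /* a real run starts here */
                    i++; lit++;
                }
                out[o++] = (uint8_t)(lit - 1);
                memcpy(out + o, in + start, lit);
                o += lit;
            }
        }
        return o;
    }

    int main(void)
    {
        uint8_t in[64], out[sizeof in + 2];
        for (size_t i = 0; i < sizeof in; i++)
            in[i] = (uint8_t)i;                /* no runs at all */
        printf("64 -> %zu bytes\n", packbits_encode(in, sizeof in, out));
        return 0;                              /* prints "64 -> 65 bytes" */
    }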
On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedranjoe@googlemail.com> wrote:
> I'm working on a hardware implementation (FPGA) of a lossless compression
> algorithm for a real-time application. The data will be fed in to the
> system, will then be compressed on-the-fly and then transmitted further.
>
> The average compression ratio is 3:1, so I'm gonna use some FIFOs of a
> certain size and start reading data out of the FIFO after a fixed
> startup-time. The readout rate will be 1/3 of the input data rate.
> The size of the FIFOs is determined by the experimental variance of
> the mean compression ratio. Nonetheless there are possible
> circumstances in which no compression can be achieved.
Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
> Since the overall system does not support
> variable bitrates a faster transmission is no solution here.
>
> So my idea was to put the question to all of you what to do in case of
> uncompressibility? Any ideas?
If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.

-- 
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
"Boudewijn Dijkstra" <boudewijn@indes.com> writes:
> On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe
> <denkedranjoe@googlemail.com> wrote:
>> I'm working on a hardware implementation (FPGA) of a lossless compression
>> algorithm for a real-time application. The data will be fed in to the
>> system, will then be compressed on-the-fly and then transmitted further.
>>
>> The average compression ratio is 3:1, so I'm gonna use some FIFOs of a
>> certain size and start reading data out of the FIFO after a fixed
>> startup-time. The readout rate will be 1/3 of the input data rate.
>> The size of the FIFOs is determined by the experimental variance of
>> the mean compression ratio. Nonetheless there are possible
>> circumstances in which no compression can be achieved.
>
> Given that uncompressible data often resembles noise, you have to ask
> yourself: what would be lost?
*Much* more information than if the signal was highly redundant.
>> Since the overall system does not support
>> variable bitrates a faster transmission is no solution here.
>>
>> So my idea was to put the question to all of you what to do in case of
>> uncompressibility? Any ideas?
>
> If you can identify the estimated compression beforehand and then
> split the stream into a 'hard' part and an 'easy' part, then you have
> a way to retain the average.
Yeah, right. And if you juggle the bowling balls when crossing the rope-bridge, you'll not break the bridge.

Phil
-- 
Dear aunt, let's set so double the killer delete select all.
        -- Microsoft voice recognition live demonstration
On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe
> <denkedran...@googlemail.com> wrote:
>
>> I'm working on a hardware implementation (FPGA) of a lossless compression
>> algorithm for a real-time application. The data will be fed in to the
>> system, will then be compressed on-the-fly and then transmitted further.
>>
>> The average compression ratio is 3:1, so I'm gonna use some FIFOs of a
>> certain size and start reading data out of the FIFO after a fixed
>> startup-time. The readout rate will be 1/3 of the input data rate.
>> The size of the FIFOs is determined by the experimental variance of
>> the mean compression ratio. Nonetheless there are possible
>> circumstances in which no compression can be achieved.
>
> Given that uncompressible data often resembles noise, you have to ask
> yourself: what would be lost?
The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite. Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.

Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.
>> Since the overall system does not support
>> variable bitrates a faster transmission is no solution here.
>>
>> So my idea was to put the question to all of you what to do in case of
>> uncompressibility? Any ideas?
>
> If you can identify the estimated compression beforehand and then split
> the stream into a 'hard' part and an 'easy' part, then you have a way
> to retain the average.
Doesn't that require sending additional information that is part of the message? On the average, this will add as much, if not more, to the message than you are removing...

If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require more bandwidth than if it were not coded at all.

Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages taking n+1 bits and 2**m messages taking m+1 bits, you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
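[Putting rough numbers on that argument (editor's sketch; n = 32, m = 8 and the probabilities are made-up values, not anything from the thread):]

    #include <stdio.h>

    /* Average cost of the flag-bit scheme described above: the 2**m most
       frequent messages go as m+1 bits (flag + m), everything else as
       n+1 bits (flag + raw n). p is the probability of hitting the
       frequent group. */
    int main(void)
    {
        const double n = 32.0, m = 8.0;
        for (double p = 0.0; p <= 1.0001; p += 0.25) {
            double avg = p * (m + 1.0) + (1.0 - p) * (n + 1.0);
            printf("p = %.2f -> average %5.2f bits (raw: %.0f)\n",
                   p, avg, n);
        }
        /* Break-even: the flag pays for itself only when p > 1/(n - m);
           with these made-up n, m that is p > 1/24. */
        return 0;
    }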
On Mon, 03 Dec 2007 18:27:50 +0100, rickman <gnuarm@gmail.com> wrote:
> On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
>> On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe
>> <denkedran...@googlemail.com> wrote:
>>
>>> I'm working on a hardware implementation (FPGA) of a lossless
>>> compression algorithm for a real-time application. The data will be
>>> fed in to the system, will then be compressed on-the-fly and then
>>> transmitted further.
>>>
>>> The average compression ratio is 3:1, so I'm gonna use some FIFOs of
>>> a certain size and start reading data out of the FIFO after a fixed
>>> startup-time. The readout rate will be 1/3 of the input data rate.
>>> The size of the FIFOs is determined by the experimental variance of
>>> the mean compression ratio. Nonetheless there are possible
>>> circumstances in which no compression can be achieved.
>>
>> Given that uncompressible data often resembles noise, you have to ask
>> yourself: what would be lost?
>
> The message! Just because the message "resembles" noise does not mean
> it has no information. In fact, just the opposite.
If you are compressing reliably transmitted pure binary data, then you are absolutely right. But if there is less information per datum, like in an analog TV signal, something that resembles noise might very well be noise.
> Once you have a
> message with no redundancy, you have a message with optimum
> information content and it will appear exactly like noise.
>
> Compression takes advantage of the portion of a message that is
> predictable based on what you have seen previously in the message.
> This is the content that does not look like noise. Once you take
> advantage of this and recode to eliminate it, the message looks like
> pure noise and is no longer compressible. But it is still a unique
> message with information content that you need to convey.
>
>>> Since the overall system does not support
>>> variable bitrates a faster transmission is no solution here.
>>>
>>> So my idea was to put the question to all of you what to do in case
>>> of uncompressibility? Any ideas?
>>
>> If you can identify the estimated compression beforehand and then split
>> the stream into a 'hard' part and an 'easy' part, then you have a way
>> to retain the average.
>
> Doesn't that require sending additional information that is part of
> the message?
Usually, yes.
> On the average, this will add as much, if not more to
> the message than you are removing...
Possibly.
> [rest of the quoted explanation snipped]
-- 
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
