Reply by rickman December 5, 2007
On Dec 5, 3:57 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> On Tue, 04 Dec 2007 17:50:46 +0100, rickman <gnu...@gmail.com> wrote:
>
> > But noise and signal that "resembles noise" are two different things. You can characterize noise and send a description of it. But if it *isn't* noise, you have just turned part of your signal into noise. So to take advantage of the fact that noise can be compressed by saying "this is noise" requires that you separate the noise from the signal. If you could do that, why would you even transmit the noise? You wouldn't, you would remove it.
>
> If you could, yes. Costs put limits on the available processing power.
That is irrelevant to the conversation. No one has mentioned data rates, processing power or time requirements, so there is no point in raising issues we have no information on. But my point remains: if your algorithm can separate the true noise from the signal that merely looks like noise, then you would just toss the real noise and improve the signal while compressing at the same time. That is my point, and from what you have said, it requires no extra processing but is part of your compression.
> > How can you flag the "easy" (compressible) part vs. the "hard" part without sending more bits?
>
> In the context of the OP's hardware implementation, you may be able to distribute these two streams over the available output pins without sending extra bits.
If the OP has two streams, one for compressible signal and one for uncompressible signal, then he could just send the original message over the uncompressible channel and avoid the issue of compression altogether.
> > As I describe below, compression only saves bits if your *average* content has sufficient redundancy. So what does "possibly" mean?
>
> If compression saves 'a lot of' bits and flagging needs 'a few' bits, then it will not "add as much, if not more to the message than [I am] removing..." Your description below only applies to certain compression algorithms, so any conclusion derived from it may or may not apply to the general case.
Compression can only save bits on the subset of signals that actually are reducible. If you do the math you will find that if the signal is randomly distributed with every pattern equally likely, no coding scheme can reduce the total number of bits sent. It is only when some signal patterns are more frequent than others that you can exploit the non-randomness of the signal to compress it into fewer bits. I haven't said anything about algorithms, so everything I have said on this applies to *all* compression algorithms. My discussion below is intended to apply to the general case; that is why I reduce the discussion to one of compressing a large number to a smaller number.
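A quick way to see both effects empirically (a minimal Python sketch; zlib is only a stand-in for any general-purpose lossless coder, and the printed sizes are indicative, not exact):

import os
import zlib

random_data = os.urandom(4096)      # incompressible: every byte pattern equally likely
redundant_data = b"AB" * 2048       # highly redundant: one short pattern repeated

# A general-purpose lossless coder can only shrink the redundant input;
# the random input comes out slightly *larger* because of framing overhead.
print(len(zlib.compress(random_data)))     # typically a little over 4096
print(len(zlib.compress(redundant_data)))  # typically a few dozen bytes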
> > > > If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require some additional bandwidth compared to not coding it at all.
> > > >
> > > > Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
>
> --
> Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
Reply by Boudewijn Dijkstra December 5, 2007
On Wed, 05 Dec 2007 10:46:03 +0100, comp.arch.fpga <ksulimma@googlemail.com> wrote:
> On 5 Dec., 09:57, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> > Your description below only applies to certain compression algorithms, so any conclusion derived from it may or may not apply to the general case.
>
> ROTFL. Did you even read it?
Yes. Multiple times.
> He outlined the formal proof that I was referencing in a little more detail.
>
> This proof shows that for ANY lossless algorithm there is an input that can't be compressed. I find it rather funny that you counter that proof by the assertion that it only applies to certain algorithms.
I didn't mean to counter the proof itself, only the claims about the relation between compression ratio and the bandwidth needed to split a stream.
> For the fun of it: would you be so kind as to present a single example of a compression algorithm that the proof does not apply to? Could be worth a PhD if you manage.
Nah. I'd rather waste my time on something else. :)

--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
Reply by comp.arch.fpga December 5, 2007
On 5 Dec., 09:57, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> Your description below only applies to certain compression algorithms, so any conclusion derived from it may or may not apply to the general case.
ROTFL. Did you even read it?

He outlined the formal proof that I was referencing in a little more detail.

This proof shows that for ANY lossless algorithm there is an input that can't be compressed. I find it rather funny that you counter that proof by the assertion that it only applies to certain algorithms.

For the fun of it: would you be so kind as to present a single example of a compression algorithm that the proof does not apply to? Could be worth a PhD if you manage.

Kolja Sulimma
Reply by Boudewijn Dijkstra December 5, 2007
On Tue, 04 Dec 2007 17:50:46 +0100, rickman <gnuarm@gmail.com> wrote:
> On Dec 4, 3:39 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> > On Mon, 03 Dec 2007 18:27:50 +0100, rickman <gnu...@gmail.com> wrote:
> > > On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> ...snip...
> > > > Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
> > >
> > > The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite.
> >
> > If you are compressing reliably transmitted pure binary data, then you are absolutely right. But if there is less information per datum, like in an analog TV signal, something that resembles noise might very well be noise.
>
> But noise and signal that "resembles noise" are two different things. You can characterize noise and send a description of it. But if it *isn't* noise, you have just turned part of your signal into noise. So to take advantage of the fact that noise can be compressed by saying "this is noise" requires that you separate the noise from the signal. If you could do that, why would you even transmit the noise? You wouldn't, you would remove it.
If you could, yes. Costs put limits on the available processing power.
> So the only type of "noise like" signal left is the part that *is* signal and the part that can't be separated from the signal. Since you can't distinguish between the two, you have to transmit them both and suffer the inability to compress them.
>
> > > Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.
> > >
> > > Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.
> > > ...snip...
> > > > If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.
> > >
> > > Doesn't that require sending additional information that is part of the message?
> >
> > Usually, yes.
>
> How can you flag the "easy" (compressible) part vs. the "hard" part without sending more bits?
In the context of the OP's hardware implementation, you may be able to distribute these two streams over the available output pins without sending extra bits.
> > > On the average, this will add as much, if not more, to the message than you are removing...
> >
> > Possibly.
>
> As I describe below, compression only saves bits if your *average* content has sufficient redundancy. So what does "possibly" mean?
If compression saves 'a lot of' bits and flagging needs 'a few' bits, then it will not "add as much, if not more to the message than [I am] removing..." Your description below only applies to certain compression algorithms, so any conclusion derived from it may or may not apply to the general case.
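To put rough numbers on that claim (a minimal Python sketch; the block size, the 3:1 ratio on compressible blocks, the one-bit flag and the 10% incompressible fraction are all assumed, not taken from the thread):

BLOCK_BITS = 4096        # assumed block size
RATIO = 3                # assumed compression ratio on 'easy' blocks
P_HARD = 0.10            # assumed fraction of incompressible blocks

# One flag bit per block marks it as stored raw or compressed.
easy_bits = 1 + BLOCK_BITS // RATIO   # flag + compressed payload
hard_bits = 1 + BLOCK_BITS            # flag + raw payload

avg_bits = (1 - P_HARD) * easy_bits + P_HARD * hard_bits
print(avg_bits, "bits/block on average vs.", BLOCK_BITS, "raw")
# With these numbers the flag costs 1 bit while compression saves ~2730,
# so the average stays well below the raw size -- but only because most
# blocks are assumed compressible.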
> > > If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require some additional bandwidth compared to not coding it at all.
> > >
> > > Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
Reply by comp.arch.fpga December 4, 2007
On 4 Dec., 09:39, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> On Mon, 03 Dec 2007 18:27:50 +0100, rickman <gnu...@gmail.com> wrote:
>
> > On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> > > On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedran...@googlemail.com> wrote:
> > >
> > > > I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed in to the system, will then be compressed on-the-fly and then transmitted further.
> > > >
> > > > The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup-time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless there are possible circumstances in which no compression can be achieved.
> > >
> > > Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
> >
> > The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite.
>
> If you are compressing reliably transmitted pure binary data, then you are absolutely right. But if there is less information per datum, like in an analog TV signal, something that resembles noise might very well be noise.
The algorithm is either lossless, or it isn't. Identifying and removing the noise is exactly what a lossy algorithm does.

For any lossless algorithm there must be an input that produces an output at least the size of the input. The proof is rather simple: you just count the number of possible inputs (2^N for N bits) and the number of bits necessary to distinguish that many different outputs (that's N bits, surprise). With fewer bits, two inputs would code to the same output.

To guarantee compression for a lossless algorithm is therefore only possible if most input sequences can't occur. If you can guarantee that half of the input combinations can't happen, you can guarantee to save a single bit. A case where that occurs is image halftoning with a fixed frequency; there you have a worst-case run-length distribution.

Kolja Sulimma
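The counting step can be spelled out directly (a minimal Python sketch; N is an arbitrary choice):

N = 16  # any input length in bits

inputs = 2 ** N                                  # distinct N-bit inputs
shorter_outputs = sum(2 ** k for k in range(N))  # all outputs shorter than N bits
                                                 # = 2**N - 1

# There are fewer strictly-shorter outputs than inputs, so any lossless
# (i.e. injective) coder must map at least one input to an output of
# N bits or more.
assert shorter_outputs == inputs - 1
print(inputs, "inputs vs.", shorter_outputs, "possible shorter outputs")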
Reply by rickman December 4, 2007
On Dec 4, 3:39 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> On Mon, 03 Dec 2007 18:27:50 +0100, rickman <gnu...@gmail.com> wrote:
>
> > On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
...snip...
> > > Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
> >
> > The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite.
>
> If you are compressing reliably transmitted pure binary data, then you are absolutely right. But if there is less information per datum, like in an analog TV signal, something that resembles noise might very well be noise.
But noise and signal that "resembles noise" are two different things. You can characterize noise and send a description of it. But if it *isn't* noise, you have just turned part of your signal into noise. So to take advantage of the fact that noise can be compressed by saying "this is noise" requires that you separate the noise from the signal. If you could do that, why would you even transmit the noise? You wouldn't, you would remove it.

So the only type of "noise like" signal left is the part that *is* signal and the part that can't be separated from the signal. Since you can't distinguish between the two, you have to transmit them both and suffer the inability to compress them.
> > Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.
> >
> > Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.
...snip...
> > > If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.
> >
> > Doesn't that require sending additional information that is part of the message?
>
> Usually, yes.
How can you flag the "easy" (compressible) part vs. the "hard" part without sending more bits?
> > On the average, this will add as much, if not more, to the message than you are removing...
>
> Possibly.
As I describe below, compression only saves bits if your *average* content has sufficient redundancy. So what does "possibly" mean?
> > If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require some additional bandwidth compared to not coding it at all.
> >
> > Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
Reply by Boudewijn Dijkstra December 4, 2007
On Mon, 03 Dec 2007 18:27:50 +0100, rickman <gnuarm@gmail.com> wrote:
> On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> > On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedran...@googlemail.com> wrote:
> >
> > > I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed in to the system, will then be compressed on-the-fly and then transmitted further.
> > >
> > > The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup-time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless there are possible circumstances in which no compression can be achieved.
> >
> > Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
>
> The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite.
If you are compressing reliably transmitted pure binary data, then you are absolutely right. But if there is less information per datum, like in an analog TV signal, something that resembles noise might very well be noise.
> Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.
>
> Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.
>
> > > Since the overall system does not support variable bitrates a faster transmission is no solution here.
> > >
> > > So my idea was to put the question to all of you what to do in case of uncompressibility? Any ideas?
> >
> > If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.
>
> Doesn't that require sending additional information that is part of the message?
Usually, yes.
> On the average, this will add as much, if not more, to the message than you are removing...
Possibly.
> If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require some additional bandwidth compared to not coding it at all.
>
> Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
Reply by rickman December 3, 2007
On Dec 3, 4:14 am, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedran...@googlemail.com> wrote:
>
> > I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed in to the system, will then be compressed on-the-fly and then transmitted further.
> >
> > The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup-time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless there are possible circumstances in which no compression can be achieved.
>
> Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite. Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.

Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.
> > Since the overall system does not support variable bitrates a faster transmission is no solution here.
> >
> > So my idea was to put the question to all of you what to do in case of uncompressibility? Any ideas?
>
> If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.
Doesn't that require sending additional information that is part of the message? On the average, this will add as much, if not more, to the message than you are removing...

If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require some additional bandwidth compared to not coding it at all.

Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers can not be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is not less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
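That count can be checked directly (a minimal Python sketch; n, m and the probability of the frequent group are arbitrary choices):

n, m = 16, 8          # message size and target size in bits (arbitrary)
p_frequent = 0.99     # assumed probability mass of the 2**m frequent messages

# Flag scheme: 2**m frequent messages cost m+1 bits, the rest cost n+1 bits.
total_flag  = (2**n - 2**m) * (n + 1) + 2**m * (m + 1)
total_plain = 2**n * n
print(total_flag, ">=", total_plain)   # summed over all messages, never fewer bits

# But the *average* drops if the frequent group dominates the traffic:
avg = p_frequent * (m + 1) + (1 - p_frequent) * (n + 1)
print(avg, "bits on average vs.", n, "uncoded")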
Reply by December 3, 2007
"Boudewijn Dijkstra" <boudewijn@indes.com> writes:
> On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedranjoe@googlemail.com> wrote:
>
> > I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed in to the system, will then be compressed on-the-fly and then transmitted further.
> >
> > The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup-time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless there are possible circumstances in which no compression can be achieved.
>
> Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
*Much* more information than if the signal was highly redundant.
> > Since the overall system does not support variable bitrates a faster transmission is no solution here.
> >
> > So my idea was to put the question to all of you what to do in case of uncompressibility? Any ideas?
>
> If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.
Yeah, right. And if you juggle the bowling balls when crossing the rope-bridge, you'll not break the bridge.

Phil
--
Dear aunt, let's set so double the killer delete select all.
    -- Microsoft voice recognition live demonstration
Reply by Boudewijn Dijkstra December 3, 2007
On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe <denkedranjoe@googlemail.com> wrote:
> I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed in to the system, will then be compressed on-the-fly and then transmitted further.
>
> The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup-time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless there are possible circumstances in which no compression can be achieved.
Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?
> Since the overall system does not support variable bitrates a faster transmission is no solution here.
>
> So my idea was to put the question to all of you what to do in case of uncompressibility? Any ideas?
If you can identify the estimated compression beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.

--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
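The FIFO-sizing question from the original post can also be explored numerically (a minimal Python simulation; the block size, FIFO depth and compression-ratio distribution are made-up parameters, not numbers from the thread):

import random

FIFO_DEPTH = 4096        # assumed FIFO capacity in words
BLOCK_WORDS = 256        # assumed block size in input words
OUT_RATE = 1.0 / 3.0     # fixed readout rate: 1/3 of the input rate

random.seed(1)
fifo = 0.0
peak = 0.0

# Feed 10,000 blocks; most compress around 3:1, but an occasional block
# (5% here) does not compress at all.
for _ in range(10_000):
    ratio = 1.0 if random.random() < 0.05 else random.uniform(2.5, 3.5)
    fifo += BLOCK_WORDS / ratio       # compressed words pushed into the FIFO
    fifo -= BLOCK_WORDS * OUT_RATE    # words drained at the fixed output rate
    fifo = max(fifo, 0.0)             # FIFO can run empty but not negative
    peak = max(peak, fifo)

print("peak occupancy:", round(peak), "of", FIFO_DEPTH)
# With these made-up numbers the incompressible blocks pull the long-run
# average ratio below 3:1, so the occupancy drifts upward and no finite
# FIFO is big enough -- which is exactly the worst case being discussed.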