Sign in

username:

password:



Not a member?

Search fpga-cpu



Search tips

Subscribe to fpga-cpu



fpga-cpu by Keywords

Altera | CISCifying | IDE | ISA | Java | JHDL | JTAG | LBU | MicroBlaze | PAR | PCI | RISC | SoC | Spartan | Transputers | Verilog | VHDL | Virtex | VLIW | WebPack | Xilinx | Xsoc | YARD-1A

Discussion Groups

Discussion Groups | FPGA-CPU | Ideas For Array Processor

This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

Ideas For Array Processor - rtstofer - Mar 17 13:27:00 2005



I am looking into the idea of a maze solving robot. The flood fill
algorithm (see
http://micromouse.cannock.ac.uk/maze/fastfloodsolver.htm) is simple
enough but takes a lot of time when serially programmed on a
microprocessor.

So, I was thinking about an array of 256 cells in an FPGA where the
value in the cell was one more than the least of its accessible
neighbors. This implies that the cell gains knowledge of bordering
walls (thus accessibility) and it would appear that the algorithm
would be asynchronous in terms of the number of cycles until
completion but with a finite limit.

Driving the bot (stepper motor, sensors, etc.) could be done with
the FPGA (or not) but it would seem to me that 256 intelligent cells
would solve the maze faster than a more typical microprocessor. Few
of these bots use Pentium processors - primarily because of power
and weight which leading to inertia issues.

Any thoughts or pointers on this idea? I imagine it has already
been done for some application such as Martin Gardner's game of
Life. For all I know, it has been done for this application as
well - I just haven't run across it yet.

Thanks!
Richard




(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )

Re: Ideas For Array Processor - John Kent - Mar 18 8:11:00 2005

Hi Richard,

I worked on a similar problem at CSIRO finding fast skeletons of blobs.
What I proposed was a perimeter traverser that performed one pixel erosion
of each binary blob in an image. I also looked at a cellular automata
approach
for solving watershed transforms, but decided that most of the logic lay
around
dormant most of the time ....

What you might look at is a perimeter traverser on a single pixel trace
used to represent
the maze. The trace only needs to be one pixel wide separated by one pixel.
You can speed the algorithm by having a bit map of each adjacent pixel
(8 way connect)
in each byte. What you do then is have a 1 of 8 direction map and vector.

7 0 1
6 x 2
5 4 3

You draw a line at 270 degrees to you direction vector. You then test
the pixels
clockwise until you find the first set pixel. That will be your new
direction vector.
You then advance one pixel in that direction and repeat the process.

Eg. Lets say you are at "x" heading in direction "1". You draw a line
out to "7"
then rotate clockwise until you hit the first set pixel, that may be at
"0" for arguments
sake. You then advance to "0" and your direction is "0".

This is akin to having a barrel shifter rotating the bits in the byte by
1 bit (direction 1)
to rotate 270 degrees yo add 6 to the barrel shifter control. You then
use a priority
encoder to determine which bit is the first one set on the output of the
barrel shifter.

When you end up back at your starting point you have traversed the
perimeter
of the object. When you reach a dead end in the maze, you will rotate
180 degrees
and go in the oposite direction. The traverser will always favour going
to the left most
branch of the maze.

John. rtstofer wrote:

>I am looking into the idea of a maze solving robot. The flood fill
>algorithm (see
>http://micromouse.cannock.ac.uk/maze/fastfloodsolver.htm) is simple
>enough but takes a lot of time when serially programmed on a
>microprocessor.
>
>So, I was thinking about an array of 256 cells in an FPGA where the
>value in the cell was one more than the least of its accessible
>neighbors. This implies that the cell gains knowledge of bordering
>walls (thus accessibility) and it would appear that the algorithm
>would be asynchronous in terms of the number of cycles until
>completion but with a finite limit.
>
>Driving the bot (stepper motor, sensors, etc.) could be done with
>the FPGA (or not) but it would seem to me that 256 intelligent cells
>would solve the maze faster than a more typical microprocessor. Few
>of these bots use Pentium processors - primarily because of power
>and weight which leading to inertia issues.
>
>Any thoughts or pointers on this idea? I imagine it has already
>been done for some application such as Martin Gardner's game of
>Life. For all I know, it has been done for this application as
>well - I just haven't run across it yet.
>
>Thanks!
>Richard >
>
>To post a message, send it to:
>To unsubscribe, send a blank message to:
>Yahoo! Groups Links --
http://members.optushome.com.au/jekent




(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )

Re: Ideas For Array Processor - rtstofer - Mar 18 11:40:00 2005



Hi John,

That approach sounds like a wall-hugger. The mazes are designed by
intent such that a wall-hugger will fail. I think they want the
students to do a little more work designing the hardware.

If we look at the first state of a 5x5 matrix with the goal at the
center, it will look like this:

255 255 255 255 255
255 255 255 255 255
255 255 000 255 255
255 255 255 255 255
255 255 255 255 255

Assuming no walls, the next iteration looks like:

255 255 255 255 255
255 255 001 255 255
255 001 000 001 255
255 255 001 255 255
255 255 255 255 255

Four cells changed during the iteration. The next iteration looks
like:

255 255 002 255 255
255 002 001 002 255
002 001 000 001 002
255 002 001 002 255
255 255 002 255 255

Eight cells changed. And so on until it fills the 16 x 16 matrix.
Each cell has a value 1 more than its minimum neighbor, like this:

004 003 002 003 004
003 002 001 002 003
002 001 000 001 002
003 002 001 002 003
004 003 002 003 004

Absent walls, the distance to the goal and the direction to turn are
all calculated.

It gets interesting when walls are added because the value becomes
that of the minimum accessible neighbor + 1. This process can be
likened to (and is named for) pouring water in the goal square and
measuring how far it travels as it reaches each square on the board.

Obviously, if the maze was a spiral the values would be increasing
by 1 around the path of the maze from the goal back to the starting
point. In the end, after the goal has been reached, a minimum path
from the origin has been found. It may not be THE minimum because
some cells were never tested for the presence of walls - the robot
discovers these as it moves along. So, on the way back home for the
next run, the robot does a little more exploration and continues to
solve the matrix because potential paths that have not been explored
will appear to be shorter than the path the bot took getting to the
goal. On the second run through the maze the bot should have a
minimum path.

On my example maze there are two minimum paths. One thing that
needs to be added is a tree search to find the minimum path in terms
of minimizing the number of turns. That should be pretty easy and
can be done at any time before starting the second pass. It is not
a real time requirement.

My idea is to create cellular automata that set their value to the
minimum of their accessible neighbors (once I figure out how to
calculate a minimum of four values in hardware) and step through the
update process until no further changes occur. Kind of like the
method of residuals I played with in a heat transfer class.

I thought that solving the matrix in parallel would result in an
update in, at most, 256 cycles for the 16x16 array and that perhaps
the cycle time could be a few tens of nanoseconds. So, maybe a few
microseconds for an update. The maze solution would not be a
limiting factor in the speed of the bot.

Richard --- In , John Kent <jekent@o...> wrote:
> Hi Richard,
>
> I worked on a similar problem at CSIRO finding fast skeletons of
blobs.
> What I proposed was a perimeter traverser that performed one pixel
erosion
> of each binary blob in an image. I also looked at a cellular
automata
> approach
> for solving watershed transforms, but decided that most of the
logic lay
> around
> dormant most of the time ....
>
> What you might look at is a perimeter traverser on a single pixel
trace
> used to represent
> the maze. The trace only needs to be one pixel wide separated by
one pixel.
> You can speed the algorithm by having a bit map of each adjacent
pixel
> (8 way connect)
> in each byte. What you do then is have a 1 of 8 direction map and
vector.
>
> 7 0 1
> 6 x 2
> 5 4 3
>
> You draw a line at 270 degrees to you direction vector. You then
test
> the pixels
> clockwise until you find the first set pixel. That will be your
new
> direction vector.
> You then advance one pixel in that direction and repeat the
process.
>
> Eg. Lets say you are at "x" heading in direction "1". You draw a
line
> out to "7"
> then rotate clockwise until you hit the first set pixel, that may
be at
> "0" for arguments
> sake. You then advance to "0" and your direction is "0".
>
> This is akin to having a barrel shifter rotating the bits in the
byte by
> 1 bit (direction 1)
> to rotate 270 degrees yo add 6 to the barrel shifter control. You
then
> use a priority
> encoder to determine which bit is the first one set on the output
of the
> barrel shifter.
>
> When you end up back at your starting point you have traversed the
> perimeter
> of the object. When you reach a dead end in the maze, you will
rotate
> 180 degrees
> and go in the oposite direction. The traverser will always favour
going
> to the left most
> branch of the maze.
>
> John. > rtstofer wrote:
>
> >I am looking into the idea of a maze solving robot. The flood
fill
> >algorithm (see
> >http://micromouse.cannock.ac.uk/maze/fastfloodsolver.htm) is
simple
> >enough but takes a lot of time when serially programmed on a
> >microprocessor.
> >
> >So, I was thinking about an array of 256 cells in an FPGA where
the
> >value in the cell was one more than the least of its accessible
> >neighbors. This implies that the cell gains knowledge of
bordering
> >walls (thus accessibility) and it would appear that the algorithm
> >would be asynchronous in terms of the number of cycles until
> >completion but with a finite limit.
> >
> >Driving the bot (stepper motor, sensors, etc.) could be done with
> >the FPGA (or not) but it would seem to me that 256 intelligent
cells
> >would solve the maze faster than a more typical microprocessor.
Few
> >of these bots use Pentium processors - primarily because of power
> >and weight which leading to inertia issues.
> >
> >Any thoughts or pointers on this idea? I imagine it has already
> >been done for some application such as Martin Gardner's game of
> >Life. For all I know, it has been done for this application as
> >well - I just haven't run across it yet.
> >
> >Thanks!
> >Richard
> >
> >
> >
> >
> >
> >
> >
> >To post a message, send it to:
> >To unsubscribe, send a blank message to: fpga-cpu-

> >Yahoo! Groups Links
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
> --
> http://members.optushome.com.au/jekent





(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )

Re: Re: Ideas For Array Processor - John Kent - Mar 18 22:26:00 2005

Hi Richard,

rtstofer wrote: >Hi John,
>
>That approach sounds like a wall-hugger. The mazes are designed by
>intent such that a wall-hugger will fail. I think they want the
>students to do a little more work designing the hardware. >
I assume you mean they have loops in the maze or its a minimum path
problem as well
as a maze solving problem.

>If we look at the first state of a 5x5 matrix with the goal at the
>center, it will look like this: >
snip... to save on email size.

>On my example maze there are two minimum paths. One thing that
>needs to be added is a tree search to find the minimum path in terms
>of minimizing the number of turns. That should be pretty easy and
>can be done at any time before starting the second pass. It is not
>a real time requirement. >
>My idea is to create cellular automata that set their value to the
>minimum of their accessible neighbors (once I figure out how to
>calculate a minimum of four values in hardware) and step through the
>update process until no further changes occur. Kind of like the
>method of residuals I played with in a heat transfer class. >
The most efficient sorting network is probably a Batcher Sorting Network.
You can find references to it on the web. If you are just finding a
minima or maximas
it probably works out to be a tree of two input minima / maxima
functions ...
ie a comparitor and multiplexer that selects the minimum of two vaules.
for a 4 way connect it will take two cycles, finding two minima in the
first cycle
then finding the minima of the results in the second stage.

If the values were being sorted serially, there is a design by Arc and
Waters
(I think thats their name) that designed a shift register with a
parallel comparitor
array that shifted the register up or down to insert a value into the
shift register.
I came up with such a design a few years ago only to find in a literature
search that they had done the same thing in 1984. (Rats !)

If you number the iterations of your flooding routine, you are
basically doing a
distance transform. I assume what you are doing is following a gradient
to find
your path through the maze. Sounds OK, but then your distance transform
will have a wave front to follow for large empty regions.

The watershed transform I mentioned is used for finding boundaries of
flooded
regions. You seed the local minimas in the image then increase the water
mark
by one pixel on each pass. For each increase in the water mark, the
neighbouring
pixels which are at that water mark are flooded out. When two regions merge,
it forms a watershed boundary.

>I thought that solving the matrix in parallel would result in an
>update in, at most, 256 cycles for the 16x16 array and that perhaps
>the cycle time could be a few tens of nanoseconds. So, maybe a few
>microseconds for an update. The maze solution would not be a
>limiting factor in the speed of the bot. >
Do you have to start the maze flood at any particular point ?
Can you flood it from the start and end and find the minimum maxima, of
the distance transform ?
ie. where the start flood meets the end flood.

>Richard >
John.

--
http://members.optushome.com.au/jekent




(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )

Re: Ideas For Array Processor - rtstofer - Mar 18 23:32:00 2005



John,
> I assume you mean they have loops in the maze or its a minimum
path
> problem as well
> as a maze solving problem.

Yes. The only 'rule' is that at each intersection there is a post
and there will be at least one wall connected to each post. Islands
do indeed exist. The only time the rule is broken is at the center
target. Here 4 squares in the center of the board may, or may not
have a center post (taller than others, I believe) that the
competitor may ask to have removed. Clearly the bot has to keep
track of its position in the matrix.

> >My idea is to create cellular automata that set their value to
the
> >minimum of their accessible neighbors (once I figure out how to
> >calculate a minimum of four values in hardware) and step through
the
> >update process until no further changes occur. Kind of like the
> >method of residuals I played with in a heat transfer class.
> >
> The most efficient sorting network is probably a Batcher Sorting
Network.
> You can find references to it on the web. If you are just finding
a
> minima or maximas
> it probably works out to be a tree of two input minima / maxima
> functions ...
> ie a comparitor and multiplexer that selects the minimum of two
vaules.
> for a 4 way connect it will take two cycles, finding two minima in
the
> first cycle
> then finding the minima of the results in the second stage.

I have written the VHDL for a single cell and it looks like it will
take about 6 nS to calculate. I now need to generate the complete
matrix with the interconnections. Maybe next week.

Even if I allow for a 10nS clock period and always iterate 256
times, the total calculation would take 2.6 uS. Very, very fast! > If the values were being sorted serially, there is a design by Arc
and
> Waters
> (I think thats their name) that designed a shift register with a
> parallel comparitor
> array that shifted the register up or down to insert a value into
the
> shift register.

I'll go look for this.

> I came up with such a design a few years ago only to find in a
literature
> search that they had done the same thing in 1984. (Rats !)

It seems there is nothing new under the sun. Some folks in Germany
just used an FPGA for a 5000x speedup of a large cellular automata
versus some type of PC. I believe they were going to study blood
flow in arteries. > If you number the iterations of your flooding routine, you are
> basically doing a
> distance transform. I assume what you are doing is following a
gradient
> to find
> your path through the maze. Sounds OK, but then your distance
transform
> will have a wave front to follow for large empty regions.

Indeed there is a wave front and this is what allows the algorithm
to take advantage of parallelism. Each cell can compute its' new
value based only on the values of its' neighbors. Now, of course,
the neighbor values may not be stable at that iteration so I see two
approaches. Iterate 256 times no matter what or have each cell post
a status regarding a change in state at the end of the iteration. A
256 input 'or' gate will indicate the need for further iterations.
My VHDL anticipates using this approach. > The watershed transform I mentioned is used for finding boundaries
of
> flooded
> regions. You seed the local minimas in the image then increase the
water
> mark
> by one pixel on each pass. For each increase in the water mark,
the
> neighbouring
> pixels which are at that water mark are flooded out. When two
regions merge,
> it forms a watershed boundary.

Yes, in this case the walls can form long dead-end corridors and the
amount of flooding necessary to reach the end is a measure of the
distance. > >I thought that solving the matrix in parallel would result in an
> >update in, at most, 256 cycles for the 16x16 array and that
perhaps
> >the cycle time could be a few tens of nanoseconds. So, maybe a
few
> >microseconds for an update. The maze solution would not be a
> >limiting factor in the speed of the bot.
> >
> >
> >
> Do you have to start the maze flood at any particular point ?
> Can you flood it from the start and end and find the minimum
maxima, of
> the distance transform ?
> ie. where the start flood meets the end flood.

I have written a simulation in Borland Pascal (under DOS, no less).
For the first pass the robot starts in a corner (a constant for a
given competition) and proceeds to the center 4 squares. In
general, there will only be one opening into this block. So, for
the first run where the robot is trying to get to the objective, the
center square(s) are initialized to 0 and the flood proceeds outward
until all the squares are calculated and stable given the current
knowledge of walls. The value in the cell containing the robot is
the distance remaining to the objective given the lack of knowledge
of further walls.

Having reached the objective (and this is guaranteed if there is a
path) the flooding now begins at the original starting point. This
will now produce a least distance path back to the beginning. In
most cases this path will be different than the path coming in
because certain areas were assumed to not have walls. These areas
were missed during the run in but will, in all likelyhood, have a
lesser calculated distance than the first path because walls are
unknown. That will change as the bot traverses the area returning
to the origin.

Anyway, on the way back to the origin, the robot encounters more
walls and adds these to the matrix.

Once back at the origin, the matrix is again flooded from the center
outward and this time the path is very probably a minimum distance
because the important walls are now known.

But, there may be more than one minimum distance path. It is now an
excercise to compute the path as a tree, calculating the cost of the
various alternatives. Long straight runs are cheap, short runs with
turns are expensive. This evaluation can be made if there is more
than one path discovered.

Not every path has been evaluated in just one run in and return
home. But, it may not matter. All potential paths have been
evaluated as though walls didn't exist and that is the shortest they
could possibly be.

It's fun to watch the simulation run!

By the way, these folks use sensors above the walls so they can not
only detect the immediately adjacent walls, they can also spot the
perpendicular walls just beyond. More knowledge is better. Oh, and
cameras have been tried - they did not succeed.

It is estimated that the final bot must be able to travel at 1.5 m/s
peak and average 1.0 m/s including turns. And the maze constructors
give no consideration to the coefficient of friction for the
surface. You have to run what you find! Totally autonomous and
battery powered for at least 10 minutes.

I am looking at putting the CA in an FPGA with the T80 core. The
256 cell values could look like memory to the Z80 and the walls, as
they are discovered by the Z80, could be bits in a memory mapped
register. The Z80 would take care of driving the bot and reading
the sensors. Stepper motor logic could be done in the FPGA as
well. Hang a compact flash on the controller and run CP/M! I think
I have done that before...




(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )