The actual “capture” presents no problem. Electron guns
cover the rows of pixels at 25 sweeps per second,
refreshing and changing intensities. 28 gray values
make feature extraction the challenge. Efficient
decision trees
collapse in the combinatorial explosion.
Digitizing brightness variation will reveal
reflective properties, texture and compositions.
The attractively human stereo systems still are troubled
by the correspondence problem: one wrench seen twice is two;
they cannot correct for viewpoint. Structured light,
though it can slip
into the angle between steel sheets, is vulnerable
to shadows. In windowing, shifts in resolution
distribute the labor. The master scans the field
coarsely, for promising features—intrusions,
protrusions, holes—
then moves in slaves for a detailed investigation.
A PUMA at the University of Rhode Island
now solves several bin-picking problems. It quickly selects
for graspability, retrieving a sequence of parts
from an overlapping mass. The arm’s parallel jaws
know when a part is between them, and close around it gently.
A bottom-up system learns like a newborn. The first
flexible net,
SOPHIA, used 12 SLAMS and had only one discriminator.
The commercially available WISARD has no state structure,
yet it achieves 100% discrimination
among target faces, through its powers of generalization.
Trained and tested on live images, it is not disconcerted
by changes in light, spectacles, grimaces
or false mustaches. Even a single-layer net displays
certain features of intentionality: the sudden
catastrophic leap from estimate to decision, when
the first burst of feedback confirms recognition
of a dubious pattern. The high-resolution window
homes in on the key feature, without camera-shake: the dot
or cross-bar blown up on top of the vertical slash,
the suspect’s face, framed and magnified over the crowd.