Publisher: Elsevier, 2000. 426 pp.
The design and implementation of a complete artificial vision system represents a daunting challenge. The Computer Vision research community has been working on this problem for over twenty-five years, and we can point to significant contributions in a number of areas. The gap between the state of the art and the goal is still wide. The main reason why we are not progressing any faster is that, quite simply, we do not know how to proceed, because we are not able to cleanly decompose and express the sub-problems to be addressed. We do not have a road map with clear landmarks to refer to. The classification of approaches into low, medium, and high level vision has many drawbacks:
- low level modules, such as edge detectors, produce primitives which "should be corrected and improved by higher level modules".
- high level modules, such as 3-D shape inference or behavior analysis, work remarkably well on perfect data, but degrade abruptly with real data.
- mid level modules are supposed to bridge the gap between low and high levels, and as such, get a long list of tasks, and a good share of the blame for failure.
The one and only complete computational theory of Computer Vision can be found in the pioneering work of David Marr [55]. It has served as a guiding light for many students and researchers, defining terms, identifying issues, and suggesting solutions. It is now showing its limitations, and current research results are rarely presented in the context of the Marr theory.
This book represents a summary of the research we have been conducting since the early 1990s, and describes a conceptual framework which addresses some current shortcomings, and proposes a unified approach for a broad class of problems. While the framework is defined, our research continues, and some of the elements presented here will no doubt evolve in the coming years. Why, then, choose to write it now?
In part, because the results are encouraging enough to be presented today, but also because a book is the proper way to convey a unified picture, an aspect which often gets lost in individual papers.
This book is not intended as a textbook, although it could be used as a complement to existing textbooks. It is organized in eight chapters. In the Introduction chapter, we present the definition of the problems, and give an overview of the proposed approach and its implementation. In particular, we illustrate the limitations of the 2.5D sketch, and motivate the use of a representation in terms of layers instead. In chapter 2, we review some of the relevant research in the literature. The discussion focuses on general computational approaches for early vision, and individual methods are only cited as references. Chapter 3 is the fundamental chapter, as it presents the elements of our salient feature inference engine, and their interaction. It introduces tensors as a way to represent information, tensor fields as a way to encode both constraints and results, and tensor voting as the communication scheme. Chapter 4 describes the feature extraction steps, given the computations performed by the engine described earlier. In chapter 5, we apply the generic framework to the inference of regions, curves, and junctions in 2-D. The input may take the form of 2-D points, with or without orientation. We illustrate the approach on a number of examples, both basic and advanced. In chapter 6, we apply the framework to the inference of surfaces, curves, and junctions in 3-D. Here, the input consists of a set of 3-D points, with or without an associated normal or tangent direction. We show a number of illustrative examples, and also point to some applications of the approach. In chapter 7, we use our framework to tackle three early vision problems: shape from shading, stereo matching, and optical flow computation. In chapter 8, we conclude this book with a few remarks, and discuss future research directions.
Previous Work
The Salient Feature Inference Engine
Feature Extraction
Feature Inference in 2-D
Feature Inference in 3-D
Application to Early Vision Problems
A: Tensor Analysis
B: Details of the Marching Algorithms
C: Software Systems