ITG Projects' Overview
LBL IMAGING AND DISTRIBUTED COMPUTING GROUP
Information and Computing Sciences Division
Lawrence Berkeley Laboratory
Berkeley, CA 94720
Publication number: LBL-35352
The basic technology problems in achieving the high level goals of using networks to enable large
scale, high speed systems relate to the nature of the underlying network, attaching to that network,
and then moving useful quantities of data in this environment. We have constructed a high speed network
testbed in order to address some of the issues. We have projects in real-time network video that
supports the video microscopy work described later, high speed storage and retrieval of video rate
data streams, protocols for real-time bandwidth and delay guarantees, and the construction and testing
of high speed ATM interfaces. The approach to this work involves a series of industry and academic
collaborations.
The local ATM testbed work has several goals. First we wish to introduce ATM networking into
the laboratory environment. This introduction is not just to promote technology, but rather to fill an
existing need to transport steady-state data streams from lab sensor systems to remote analysis systems
(e.g. LBL's MasPar MP-2) and to storage systems. The second goal is to use the technology to
promote routine remote collaboration and intellectual interaction. The third goal is related to the second:
to enable routine handling of various multimedia data streams, up to and including
video. The fourth goal is to prepare the local site infrastructure for the next generations of ESNet.
In the first case the approach is to work with several laboratory instrument systems (e.g. slow and
fast scan CCD cameras) so that they attach directly to an ATM network. There are a number of unanswered
questions relating to system architectures, data transport reliability, and capability.
In the second case, LBL is a participant in the Bay Area Gigabit Testbed (BAGNet). This testbed
is a collaboration of the local carrier (Pac Bell) and a collection of commercial and public computer
science organizations. The testbed is a metropolitan area ATM network with about a dozen sites connected.
The applications work will focus on multi-way tele-seminars, large scale distributed storage
systems, and remote collaboration.
The approach to the third goal is to use a high speed gateway to connect the local ATM network
(which is also the interface to the BAGNet testbed) to several Laboratory networks. The gateway
will convert video and other multimedia formats so that multimedia
streams can be made available to a variety of platforms.
The fourth goal of promoting local ATM networks is largely addressed by doing the first three.
The current state of this project is that much of the infrastructure (fiber runs, workstations, ATM
switch, and high speed gateway) is either deployed or in the process of deployment. Most of the
actual work and experiments will be done next year.
Our applications strategy is that distributed high performance applications are partitioned and
mapped onto computational elements that are optimal for the different parts of the software. Networks
are then used to combine these elements with network-based data input, storage, and output
devices to assemble a complete distributed system. All of these elements must communicate data
among themselves with high bandwidth and low overhead. The ability to configure these "virtual"
systems is based on using high speed networks and interprocess communication mechanisms to associate
dispersed resources, and is the approach that will enable scalable, reusable, and easily reconfigured
software that addresses large scale problems. We have done considerable work in this area,
distributing applications across multiple supercomputers and workstations, across multiple
workstations ("clusters"), and across scalar and massively parallel systems. This work has led to
considerable insight into the real issues of high speed distributed computing. Much of the current
focus of our work is in trying to incorporate an MPP system into a "real" high speed distributed application
(the several video microscope projects).
The world of microscopic scale offers numerous opportunities to make revolutionary changes to
laboratory technique because of the impossibility of control based on conventional sensing. For real-time
microscopic processes, video imaging offers one of the few methods for sensing the environment,
and, when changes in the control mechanisms of the environment are required, hand or even
preprogrammed control is rarely effective. There are several essential components needed to address
the problem of micro-control: sensing (almost always by imaging through the microscope), extraction
of information from the images, and mechanisms of controlling the environment. The imaging is usually
tuned to the particular application, but results in a stream of images (since a dynamic environment
is what is of interest). Extraction of information from the images provides us with shape and
position information in order to gauge the effect of some external force. This is addressed in our work
in Computer Vision for Scientific Image Analysis. The final aspects are data collection, handling,
computation, and control, and these are addressed as aspects of our various network testbeds.
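The sense, extract, and control cycle described above can be illustrated with a minimal sketch. It is not the group's method: the knowledge based image analysis is replaced here by a plain brightness-threshold centroid, and the function names, the threshold, and the proportional gain are all hypothetical choices made for illustration.

```python
def centroid(image, threshold):
    """Return the (row, col) centroid of pixels brighter than threshold, or None.

    Assumption: the object of interest is simply the set of bright pixels.
    """
    row_sum = col_sum = count = 0
    for r, line in enumerate(image):
        for c, value in enumerate(line):
            if value > threshold:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


def control_step(image, target, threshold=128, gain=0.5):
    """One loop iteration: extract a position, return a proportional correction.

    The correction is a hypothetical (d_row, d_col) command for an actuator.
    """
    position = centroid(image, threshold)
    if position is None:
        return (0.0, 0.0)  # nothing detected, so issue no correction
    return (gain * (target[0] - position[0]),
            gain * (target[1] - position[1]))
```

In a running system this step would be applied to each frame of the image stream, so the rate at which frames can be moved and analyzed bounds how quickly the environment can be corrected.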
It is not sufficient to tell the scientific community that "we have developed high speed networks,
and they will solve laboratory problems". Real scientific problems have to be successfully addressed
in order to demonstrate that investment of scarce scientific resources in this technology will provide
new and potentially revolutionary approaches to problems. Automated image analysis for information
content is a critical issue in making the combination of high speed networks and high speed computing
provide revolutionary, as opposed to evolutionary, new approaches to problems.
It is also the case that this technology is a key element of similar problems in different fields. For
example, inspection and fabrication of micro electromechanical and nano-scale devices will almost
certainly require a similar technology. The ability to scan large digital libraries for similarity analysis
will impact many fields that use reference image databases (for example pathology and radiology).
This general problem has been the subject of much research over the past twenty years, and we do
not claim a general solution. However we have made significant progress, especially in coupling
knowledge based image analysis systems to sources of video, and using the results for micro manipulation.
The Image Server System (ISS) represents a basic technology for digital image library systems.
The requirement is that many users can be provided with images or image streams fast enough to satisfy
their applications. This access rate could be relatively slow for single small images, faster
(Mbytes/sec) for large images (e.g. radiology or image plate scanner), and faster still
(~10-12 Mbytes/sec) for roaming over very large tiled images or viewing video clips.
The goals of the design are to provide the functionality noted above, to have a system architecture
that is inherently scalable, and to have an implementation that is both scalable and economic. Our
approach is to have network based servers that operate in parallel to satisfy requests for image
data. This provides bandwidth scalability through a design that scatters pieces of images (or frames of
video sequences) across many servers. A single request is satisfied by several servers, thereby allowing
the network to logically aggregate multiple low bandwidth streams into a single high bandwidth
stream at the application. This satisfies the goal of scalable bandwidth. (It should be noted that several
of the optimizations involved are due to the characteristics of image data structure, and this approach
is not necessarily useful for general data.) In a similar fashion, the capacity of the overall system is
scalable by adding disks and / or servers. The basic approach to the implementation is to use relatively
inexpensive workstations as servers, matching the disk capacity to the network throughput of
the server.
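The scattering of image pieces across servers can be sketched as follows. This is an illustration under stated assumptions, not the ISS implementation: round-robin placement is assumed (the text says only that pieces are scattered), the network read is a stand-in that returns a label, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def place_tiles(num_tiles, num_servers):
    """Assumed round-robin placement: tile i is stored on server i % num_servers."""
    placement = {s: [] for s in range(num_servers)}
    for tile in range(num_tiles):
        placement[tile % num_servers].append(tile)
    return placement


def read_from_server(server_id, tile_ids):
    """Stand-in for a network read; returns (tile_id, payload) pairs."""
    return [(t, f"tile-{t}@server-{server_id}") for t in tile_ids]


def fetch_image(num_tiles, num_servers):
    """Request each server's tiles in parallel, then reassemble in tile order."""
    placement = place_tiles(num_tiles, num_servers)
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        futures = [pool.submit(read_from_server, s, ids)
                   for s, ids in placement.items()]
        pieces = [pair for f in futures for pair in f.result()]
    return [payload for _, payload in sorted(pieces)]


def aggregate_rate(per_server_mbytes_per_sec, num_servers):
    """Idealized aggregate rate, ignoring network and client bottlenecks."""
    return per_server_mbytes_per_sec * num_servers
```

Under this idealization, four servers that each sustain 3 Mbytes/sec would together deliver about 12 Mbytes/sec to one application; in practice the network path and the receiving host limit how close a real system gets to that figure.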
The ISS is initially targeted for use with the ARPA MAGIC (Multidimensional
Applications and Gigabit Internetwork Consortium) gigabit network testbed project. MAGIC is
building a terrain visualization application, known as TerraVision, that will allow a user to view and
navigate through (and over) a landscape created from aerial images. TerraVision requires large
amounts of data, transferred at both bursty and steady rates, and has network throughput as its major
limiting factor. The implementation of the design of the ISS for use with TerraVision is essentially
complete, and preliminary testing indicates that the approach is valid. As the servers are deployed in
the MAGIC testbed, larger scale tests will further stress the design.
Healthcare systems are an interesting prototype for the National Information Infrastructure (NII), since they have essentially all of the
characteristics of most, if not all, NII applications. In particular, they have all of the characteristics of
image based information environments:
- large scale, distributed, multimedia, federated databases (patient record systems)
- representations of knowledge and protocols
- security and privacy
- accounting and billing
- interface to laboratory systems (auto acquisition of lab results)
- distributed instrumentation (tele-medicine)
- high bandwidth, high availability image servers (tele-radiology)
- high reliability, large scale archival storage
- widespread tele-education for mandatory life long learning
- heterogeneous telecommunications access
- digital libraries
- high payoff / quality of life
Our activities in this area involve leading the design effort for an integrated tele-medicine project
in Kansas City, and participation on several sub-committees of the MCC-led Hospital Information
Systems Consortium.
This work provides both for technology transfer of our expertise in distributed computing and digital
image library systems, and for the opportunity to see how our approach will integrate into a large
scale application domain.
It is our belief that image-based applications will play a major role in the use of computing and
communications systems in the next century. This class of applications will be a key element in
everything from scientific research to group work environments to entertainment to education. For many of
the same reasons that video games have a near universal appeal for children (in that they present a
visually complex and challenging activity), the use of image-based technology will become increasingly
important in education. For example, having students capture, create, manipulate, and store their
own images through a "digital darkroom" system is not only interesting, but also teaches a whole range of
concepts about digital imaging and database technology. We also believe that this is the kind of
approach that will have a more universal appeal across skill levels and cultural backgrounds than
more traditional approaches to introducing computing technology.
We have worked with several education groups and put together the imaging technology aspects of
an overall proposal, and we have contributed to the DOE strategy for technology in education.
We have worked with high school teachers and students to put together several prototypes to
explore the issues of imaging technology applied to education. The "Whole Frog" project has resulted
in a high resolution 3D image data set for a frog. Several teachers and anatomy students segmented
(identified) eighteen anatomical systems, and built geometric models for these. This "knowledge
base" has been represented by MPEG video clips and "QuickTime" clips, and was turned into an interactive multimedia
teaching unit by three high school teachers.
Administrative Information
Ownership and Revision History
Group leader William E. Johnston, johnston@george.lbl.gov, is responsible for this WWW document.
This page is located at http://www-itg.lbl.gov/~jason/cv.support/AR.93.Sum.html and was last updated Tuesday, 24-May-2005 11:16:12 PDT.