.\" @(#) $Header: vat.1,v 1.7 94/06/01 18:37:20 mccanne Exp $ (LBL)
.\"
.\" Copyright (c) 1992-1994
.\" The Regents of the University of California.  
.\" All rights reserved.  
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgment:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.  
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  
.\"
.TH VAT 1  "17 Feb 1992"
.SH NAME
vat \- X11-based audio teleconferencing tool
.SH SYNOPSIS
.na
.B vat
[
.B \-aAcdEjJklLMnnsSv
] [
.B \-F
.I device
] [
.B \-f
.I audiofmt
]
.br
[
.B \-g
.I geometry
] [
[
.B \-m
.I host/port[/fmt]
] [
.B \-N
.I sessionname
]
.br
[
.B \-C
.I conferencename
] [
.B \-K
.I key
] [
.B \-P
.I priority
] [
.B \-t
.I ttl
] [
.B \-U
.I socket
] [
.B \-u
.I script
] [
.I dest
]
.br
.ad
.SH DESCRIPTION
.LP
.I Vat
allows users to conduct host-to-host or multihost audio teleconferences
over an internet (multihost conferences require that the kernel support
IP multicast).  On most architectures, no hardware other than a microphone
is required \(em sound I/O is via the built-in audio
hardware.  On DEC systems, an AudioFile server must be running.
.SH OPTIONS
(Note that all the parameters set by
the following flags can also be controlled by X resources
(which all have `reasonable' defaults)
so one should not need to give
.I vat
any flags in the usual case.  The flags only exist to temporarily
override some resource.)
.TP
.B \-a
Enable automatic gain control on the output (speaker).
.TP
.B \-A
Enable automatic gain control on the input (mike).
.TP
.B \-c
Start up in `conference mode' (see description below).  (This flag
is the opposite of
.BR \-l .)
.TP
.B \-C
Use 
.I conference
as the title of this vat window.  If no \-C flag is given, the
destination address and port are used as the window title.  
.TP
.B \-d
Start up with the VU meters disabled (this can be changed using the
.B "`Disable Meters'"
checkbox on the auxiliary controls panel.  
.TP
.B \-E
Include a checkbox in the auxiliary controls panel for specifying that
echo cancellation is being performed externally (i.e., in hardware).
This option can also be effected by setting the X resource
.I Vat.externalEchoCancel
to ``true''.
.TP
.B \-f
Sets the output audio packet format to
.I fmt.  
(Note that it not necessary to set an input format since vat will accept
.I any
packet format it knows about.)  Currently available audio formats are:
.nf
	pcm	78Kb/s 8-bit mu-law encoded 8KHz PCM (20ms frames)
	pcm2	71Kb/s 8-bit mu-law encoded 8KHz PCM (40ms frames)
	pcm4	68Kb/s 8-bit mu-law encoded 8KHz PCM (80ms frames)
	dvi	46Kb/s Intel DVI ADPCM (20ms frames)
	dvi2	39Kb/s Intel DVI ADPCM (40ms frames)
	dvi4	36Kb/s Intel DVI ADPCM (80ms frames)
	gsm	17Kb/s GSM (80ms frames)
	lpc4	9Kb/s Linear Predictive Coder (80ms frames)
.fi
(The
.I dvi
encoding was contributed by Jack Jansen of CWI.  
The
.I gsm
coder was contributed by Carsten Bormann
of the Technische Universitaet Berlin.  
The
.I lpc
coder was contributed by Ron Frederick of Xerox PARC.)
.sp .5
The default
audio format can be set with the
.I audioFormat
X resource.  It defaults to
.B pcm.  
.TP
.B \-g
Override the default window geometry with
.I geometry, 
which should be a standard X geometry string.
.TP
.B \-j
Start up with audio output to the external audio jack.  
This flag can be defaulted by setting the X resource
.I Vat.speaker
to false.  
.TP
.B \-J
Start up with audio output muted.
.TP
.B \-k
Keep all sites in the `Conference Hosts' panel.  Normally sites that
exit are deleted from the panel.  With
.B \-k,
sites that exit are grayed-out but not deleted.  
.TP
.B \-K
Use 
.I key
as the encryption key for this vat session.
.TP
.B \-l
Start up in `lecture mode' (see description below).  
This flag can be defaulted by setting the X resource
.I Vat.lectureMode
to true.  
.TP
.B \-m
Serve as a `mixer' for host
.I host
which is listening on port
.I port
optionally using audio packet format
.I fmt.  
The mix function allows vat to function as `application level gateway'.  
There are three uses for this function:
.sp .5
.I "Bandwidth Adaptation"
.br
Say you want to participate in a conference at work that is using 64Kb/s
PCM but you to join over the 9.6Kb/s line to your home.  If there is a
machine "work" at work that can proxy for your machine "home" at home,
this could be set up as:
.nf
	At work:  vat -m home/5678/lpc4
	At home:  vat work/5678/lpc4
.fi
.sp .5
.I Concentration
.br
If a conference has two communities connected by a low speed link,
it's desirable to mix multiple conversations into a single data
stream rather than shipping them over the link individually.  
If, say, the gateway machines were "foo" and "bar" and the
conference were on the default multicast address, this would
be set up as:
.nf
	On foo:  vat -m bar/5678
	On bar:  vat -m foo/5678
.fi
(Note that the port used by the mixers has to be different than
the port used by the conference.)
.sp .5
.I "Format Conversion"
.br
Say organization A uses Etherphone audio packet format for internal
conferencing but wants to have a joint conference with organization
B that uses G.278 packet format.  
If a machine AtoB at A can be used
to convert/concentrate traffic to B via a peer BtoA, the two
conferences can be combined with:
.nf
	on A's AtoB:  vat -m BtoA/5678/pcm Aaddr/Aport/etherphone
	on B's BtoA:  vat -m BtoA/5678/pcm Baddr/Bport/g278
.fi
Note that the two conference environments need have
.I nothing
in common --- they can use different audio formats, addresses and/or ports.  
The only agreement needed is over the peer-to-peer communication
between AtoB and BtoA.  
.TP
.B \-M
Start up with audio input unmuted.
.TP
.B \-n
Force `native' vat packet format and default address.  (This flag
is the opposite of
.BR \-v .)
.TP
.B \-N
Use 
.I session,
in lieu of your user name and local host,
to identify you to other sites.  
If no \-N flag is given, the X resource Vat.sessionName is used.
.TP
.B \-P
Use 
.I priority
as this vat window's priority for obtaining the audio device.
All vat windows have a priority which is typically set by the
X resource Vat.defaultPriority (defaults to 100) but this can be
overridden by a
.B \-P
flag.  If a window requests the audio
(because new network data arrived or the mike has been unmuted)
and the window currently holding the audio is either lower priority
or hasn't used audio for Vat.idleHoldTime seconds, the audio holder
immediately gives it up.  Otherwise the new window's request is
ignored.  (Vat.idleHoldTime provides hysteresis to prevent `thrashing'
when two conferences go active at about the same time; the priority
provides a way to distiguish `background' windows, say a radio
station broadcast or a `directory' window of people reachable via
vat, from `foreground' activity like a particular audio conference
so vat can make better decisions on what should get the audio.)
.TP
.B \-s
Start up with audio output to the internal speaker.  
(This flag is the opposite of
.BR \-j .)
.TP
.B \-S
Make new sites come up `suppressed' (the check box next to the
sitename will be checked and you will have to click on it to hear
the site speak).  This flag is intended for something like meeting
audiocasts where a moderator wants to have control over who is able to
speak.  This flag can also be set by the Vat.muteNewSites X resource.
.TP
.B \-t
Set the multicast ttl (time-to-live) to
.I ttl.  
(The ttl is ignored if the destination address is not an IP multicast
address.)  If no \-t flag is given, the value of the X resource
.I Vat.defaultTTL
is used.  
.TP
.B \-U
Use the unix-domain stream socket specified by
.I socket
for audio I/O.  Some process should bind to and listen on
this socket before vat is run.  The data is raw 8khz mulaw samples.
If 
.I socket
is a number, then AudioFile is used.  The number indicates
the corresponding AudioFile device.
.TP
.B \-u
Use
.I script
as the tcl program to build the user interface, rather than
the built-in script.  You can also override vat's built-in
tcl script with your own tcl code in $HOME/.vat.tcl.
.TP
.B \-v
Use a packet format and default multicast address that is compatible
with ISI's
.I vt
(voice teleconferencing tool).  
.I Vat
and
.I vt
can interoperate but
.I vt
does not participate in
.I vat's
`session' protocol so the names of conference participants running
.I vt
will not show up in the
.I vat
`Conference Hosts' window (IP addresses for
.I vt
participants appear instead).  
.sp .5
In vt compatibility mode, the default destination host or multicast
group is controlled by the X resource
.I Vat.defaultVTHost
rather then
.I Vat.defaultHost.  
The
.B \-v
flag can be defaulted by setting the X resource
.I VTMode
to true.  
.TP
.SH "VAT Operation"
Note:  In addition to invoking the ``quit'' button, typing `q', `Q', ctrl-C or
ctrl-D anywhere in the window will terminate
.I vat.
.LP
The
.I vat
window is divided into two parts: the right has controls for the local
audio and the left has a status display of the hosts participating
in the current conference.  The audio controls consist of two sliders
that control the mike and playback gain, a button to toggle output
between the built in speaker and the headphone jack, buttons to
mute/unmute either the mike or speaker, and buttons to control
acquistion of the audio hardware.  To adjust the
sliders, either click at the desired position or click and drag
the slide button to the desired position.  Just to the left of each
slider is a VU meter.  A rule of thumb is to adjust the mike and
speaker gain sliders so the peak readings on the meter are about 80%
of full scale.  
.LP
To change the audio output line (i.e., speaker, headphone, lineout, etc.)
click on the speaker icon (it should change to a headphone icon).
Additional clicks will round-robin among the available lines.
If there is only one option, the button will be disabled.
Similarly, click on the mike icon to select among the input lines.
To toggle
mike/speaker muting, click on the appropriate
.B mute
button.  The button will be highlighted when muting.
.LP
The
.B "Conference Hosts"
window lists every site
currently participating in the conference, with your site always listed
first.  The participant name
is displayed in a box that is highlighted whenever that site is
speaking and grayed-out if a `session' message message from the
site hasn't been received for at least 30 seconds (each vat sends
`session' message every 6 seconds) \(em this usually indicates that
the site has lost connectivity or that vat has been aborted or
stopped.  There is a checkbox to the left of each participant name.  
Clicking on the box will cause audio from that participant to
be discarded instead of played (for example, this might be used to
suppress a site that is generating echoes).  
Names, by default, appear in the form
.I user@host,
but can be arbitrarily specified by each participant (i.e., with \-N).  
.SH "Multiple VAT Windows"
One host can be running an arbitrary number of vat sessions (presumably
with different destination addresses).  However, since most workstations
have only one set of audio hardware, only one of those sessions
will be able to access the mike and speaker.
For the most part, the vat sessions will automatically follow the action.
If you unmute the mike or press the ``Keep'' button, the audio
device will be acquired by that session and the session that
previously held the audio will relinquish it.  Vat displays
it's title bar in an oblique font when the audio is not being held.
.LP
A vat session will also acquire the audio if there is input
from the network.  But to prevent a background vat session
from stealing the audio from the foreground session, you
can toggle the ``Keep Audio'' button.  When the ``Keep Audio''
button is highlighted, vat will reliquish the audio only if
there is a user demand in another window (i.e., unmuting the mike
or selecting the ``Keep Audio'' button).
.LP
Participants in a multi-site conference often want to have
`side conversations' that don't bother the rest of the conference
participants.  
Vat has some support for establishing side conversations:
If you middle-click on the name of some
site in the conference hosts window, a new vat window will be created
that talks only to that participant (it sends unicast datagrams rather
than multicast).  If that other participant also middle-clicks on your
site, you can have a private conversation between just your two sites
using the newly created vat windows.  Note:  due to a `bug' in
the way most systems implement multicast, if you create a new window
aimed at a particular participant but they haven't created a window
aimed at you, they will hear you speaking in the main conference window
and may not realize that your audio is being sent only to them and
not multicast.  One can view this either as a feature (it provides
a semi-private channel you can use to ask someone to set up a side
conversation) or a bug (it often leads to strange, one-sided
conversations where one side multicasts and the other doesn't).  
.LP
Note that a site that's running as a `mixer' (\-m flag) for some
other site should
.I not
run multiple vat windows:  The audio hardware is needed to provide
the timing for the mixing function and the window doing the mix
will cease to function if it loses control of the audio hardware.  
.SH "Auxiliary Controls"
Clicking on the ``Menu'' label at the bottom of the
.I vat
window will cause a panel of
.B "auxiliary controls"
to open.  
.LP
The
.B "Audio Tests"
buttons will enable some audio test modes.  These should
.I not
be selected during a conference.  
The
.B "loopback mike"
button will cause input from the mike to be sent to the local
speaker/jack.  This might be useful for checking levels and
debugging cable problems but the 20ms delay from input to output
makes talking in this mode almost impossible.  The three tone
buttons will generate one of three reference tones through the local
speaker.  Level setting should generally be done with the -6dBm
tone.  
.LP
The
.B "Output Mode"
buttons control what vat will do to avoid feedback/echo from the
mike to the speaker.  In
.B mike mutes net
mode, vat will mute the speaker whenever it thinks that you are talking,
while in
.B net mutes mike
mode, vat will mute the mike whenever input from the network arrives.
In
.B full duplex
mode, vat will assume that feedback can't happen and do nothing to avoid it.  
In
.B echo cancel
mode, vat will attempt to eliminate echoes by doing some fancy signal
processing.  (EchoCancel requires the BSD sound driver \(em it is
disabled when running vat under Sun OS because the Sun driver does not
provide any
mechanism to time correlate audio output and input.)  The internal
speaker should only be used in `speakerphone' or `echo cancel' mode \(em
selecting `headphone' mode for it will result in your site injecting
a lot of unpleasant echoes into the conference.  The headphone jack
should be set to `FullDuplex' mode if you have headphones plugged into
it and `MikeMutesNet' or `EchoCancel' mode if you have an external
amp and speaker plugged into it.  
.LP
There are two
.I "type-in boxes"
(see below) at the bottom of the Auxiliary Controls panel.  The one labeled
.B `Name'
can be used to change the session name announced to other sites.  
The one labelled
.B `Key'
can be used to specify an encryption key (see next section).
.SH Encryption
Since vat conversations are typically conducted over open IP networks
there is no way to prevent eavesdropping, particularly for multicast
conferences.  To add some measure of privacy, vat allows the audio
packet streams to be DES encrypted.  Presumably only sites sharing
the same key (and, of course, the NSA) will be able to decrypt and
listen to the encrypted audio.
.LP
Encryption is enabled by entering an arbitrary string in the
.B key
box (this string is the previously agreed upon encryption key
for the conference \- note that key distribution should be
done by mechanisms totally separate from vat).  Encryption
can be turned off by entering a null string (just a carriage
return or any string starting with a blank) in the
.B key
box.
.SH "X Resources"
The following are the names and default values of X resources used by
.I vat.
This list is incomplete.  Consult the tcl code in vat.tcl for
the complete set.
.LP
.RS
.nf
.ta \w'vat*controls*title*font:  'u
Vat.defaultHost:	224.2.0.1
Vat.defaultVTHost:	224.1.0.200
Vat.defaultPort:	3456
Vat.defaultTTL:	127
Vat.audioFormat:	pcm
Vat.vtMode:	false
Vat.lectureMode:	false
Vat.inputPort:	Mike
Vat.outputPort:	Speaker
Vat.speakerMode:	Speakerphone
Vat.jackMode:	Headphone
Vat.mikeGain:	32
Vat.speakerGain:	180
Vat.jackGain:	180
Vat.mikeAGC:	false
Vat.mikeAGCLevel:	32
Vat.speakerAGC:	false
Vat.speakerAGCLevel:	32
Vat.maxPlayout:	6
Vat.defaultPriority:	100
Vat.idleHoldTime:	10
Vat.idleDropTime:	0
Vat.autoRaise:	true
Vat.pushToTalk:	false
Vat.keepSites:	false
Vat.key:	
Vat.muteNewSites:	false
Vat.siteDropTime:	30
! fonts
Vat.titleFont:	-*-helvetica-bold-r-normal--*-140-75-75-*-*-*-*
Vat.audioFont:	-*-helvetica-medium-r-normal--*-100-75-75-*-*-*-*
Vat.helpFont:	-*-times-medium-r-normal--*-140-75-75-*-*-*-*
Vat.ctrlFont:	-*-helvetica-bold-r-normal--*-100-75-75-*-*-*-*
Vat.ctrlTitleFont:	-*-helvetica-bold-r-normal--*-120-75-75-*-*-*-*
Vat.entryFont:	-*-helvetica-medium-r-normal--*-100-75-75-*-*-*-*
.fi
.DT
.RE
.LP
.B Vat.maxPlayout
is the maximum `play out' delay, in seconds, that can be tolerated.  I.e.,
.I vat
dynamically adapts to delays introduced in the network by delaying
the play out of a remote site's audio packets.  The range of adaptation
is limited by the size of a buffer in vat and this parameter essentially
sets the size of that buffer.  Setting
.B maxPlayout
larger than 10 seconds will probably result poor vat and system behavior
due to excessive paging activity.  
.LP
Vat has two different modes of adapting the playout delay, one more
suitable for an interactive, multi-party discussion or conference and
the other more suitable for listening to a speech or lecture.  
The two modes differ in how quickly they `forget' the delay vat
introduces to adapt to transient network congestion:
In
.I "Conference mode"
vat attempts to minimize the delay (since large
delays make interactive conversations difficult) but this usually
results in more lost packets when the delay becomes too short handle
the next congestion event.  In
.I "Lecture mode"
vat attempts to minimize lost packets by reducing delays very slowly.  
This results in the clearest playback but interactivity may suffer.  
.LP
.I "Conference mode"
is the default when vat starts up unless the
.B \-l
flag is given or the X resource
.I lectureMode
is set to true.  There are radio buttons in the network section of
the Auxiliary Controls panel to switch between Conference and Lecture
modes.  The switch can be made at any time and takes effect immediately.  
.SH "Type-in Boxes"
Since there's no standard for X keyboard input we had to roll our own
and vat's type-in boxes need some explanation:  To enter something new in
a type-in box, position the mouse in the box and click to insert
a cursor.
To modify the existing contents of the box, click with the
left mouse button where you want to append (or click-and-drag if you
want to replace) then start to type.  If you hit carriage return,
your new input will be acted on.  If you hit escape or bell (^G)
your new input will be discarded.
.LP
If the string is too long for all of it to be visible, the right
mouse button can be used to drag invisible portions into view.
Hold down the button and drag the text the direction you want it
to move.
.LP
While typing, a tiny subset of emacs-like line editing commands
are available: ^H or DEL (delete previous character),
^D (delete character), ^F (forward char), ^B (backward char),
^A (beginning of line), ^E (end of line), ^W (delete previous word),
^U (delete everything).
.SH Statistics
Clicking on a name with the left mouse button will bring up a
window with packet statistics for that site.  The window will
remain mapped as long as the button is depressed.  There are 
three of numbers.  The last column is the aggregate statistics
since vat started, and the first column is the difference between
the last update time and the current time.
The middle column is a smoothed average of the first column
with units depend on the statistic.  The smoothing filter
is a exponentially weighted average with gain Vat.statTimeConst.
All of the averages (except total packets) are displayed as a
percent of number of packets sent (which is actually an estimate
based on the number of packets received plus those estimated
to be missing).

The statistics are
updated every second or so while the window is mapped.
If you hold
down the shift-key when you bring up the statistics window, it
will be mapped permamently and will include stripchart.
The stripchart plots the selected statistic against time.

Altenatively, the stats for all the sites can be sent to standard
out by typeing 's' or 'S' anywhere in the vat window.
The first entry is the total packets received from the site since
starting vat and since the last time you requested stats.  
E.g., `3515/62' means 3515 packets have been received from the site
and 62 received since the last time you printed stats.  
The next four entries are various error counters.  Each consists of
two numbers separated by a slash.  The first number is a count of
how many times the error occurred since the last time you requested
stats.  The second is the same info expressed as percentage of
packets received.  E.g., `9/3.0' in the `drop' column would mean
9 packets or 3% of the packets received since the last stats
output had been dropped.  
The
.I `lost'
column counts packets that didn't make it (this is computed
by keeping track of gaps in the packet sequence number space).  
The
.I `drop'
column counts packets that arrived but were dropped because
the current playout delay was too short to resequence the
packet.  
The
.I `dup'
column counts duplicate packets that were discarded
(you shouldn't see any of these
unless the packets had the misfortune to traverse the T1 NSFNet
backbone -- the worthless IBM token ring hardware in each NSS
loves to duplicate packets).  
The
.I `order'
column counts packets that arrived out of order (this is not
an error but it does indicate routing strangeness and will always
increase the playout delay).  Following the error stats is the
current playout delay for the site and the site name.  
.SH "SEE ALSO"
audio(4)
.SH AUTHORS
Van Jacobson (van@ee.lbl.gov) and
Steven McCanne (mccanne@ee.lbl.gov), both of
Lawrence Berkeley Laboratory, University of California, Berkeley, CA.  
.sp .5
Jack Jansen (Jack.Jansen@cwi.nl) of 
Stichting Mathematisch Centrum, Amsterdam, the Netherlands,
contributed the Intel DVI ADPCM codec.  
.sp .5
Ron Frederick (frederic@parc.xerox.com) of Xerox PARC, Palo Alto, CA,
contributed the LPC codec
which is based on an implementation done by Ron Zuckerman
(ronzu@isu.comm.mot.com) of Motorola which was posted to the Usenet
group comp.dsp on 26 June 1992.  
.sp .5
Carsten Bormann (cabo@cs.tu-berlin.de) and
Jutta Degener (jutta@cs.tu-berlin.de)
of the Communications and Operating Systems Research Group (KBS) at the
Technische Universitaet Berlin
contributed the GSM codec.
.sp .5
Steve Casner (casner@isi.edu) of ISI,
Los Angeles, CA, and Steve Deering (deering@parc.xerox.com) of
Xerox PARC have invested tremendous effort in
making vat work on a scale far beyond the authors' wildest
expectations and have contributed greatly to vat's development,
both directly (via careful analysis of bugs and useful suggestions)
and indirectly (via setting up several global conferences that
severely pushed the envelope of vat's capabilities).  
.SH BUGS
Speakerphone mode is difficult to get right \(em use a headset if you
can (or run BSD instead of Sun-OS to get a kernel audio driver that
can support echo cancellation).  If you have to use speakerphone mode,
try to position the mike as far as possible from the speaker (the
speaker in a sparcstation is on bottom of the machine in the front
right corner near the LED).  If there's a problem with echo (i.e.,
you transmit whenever other people start speaking), try reducing the mike
gain or mute the mike when you're not speaking.  
.LP
In speakerphone mode
.I vat
assumes that if there is audio data from the net being sent to the
speaker at least part of the signal from the mike is pickup from
the speaker.  So, unless the mike signal is `large' compared to the
signal from the net,
.I vat
assumes it is echo and suppresses it.  This means that if you want
to interrupt someone who is talking, you may have to talk a bit
louder than usual at the start (you can tell if you succeed because
your site's name box will light and the speaker will mute).  
.LP
The internal architecture of
.I vat
allows for almost any audio encoding and packet format.  
We are working on U.S. Federal Standard 1016 4800bps CELP
(a very low bandwidth codec that could be run over 9600baud
dial-up SLIP links) and hope to have it in the official release.  
We'd be very interested if anyone would like to contribute additional audio
encode/decode routines.
.LP
Vat's LPC coder is low quality.  We plan to incorporate
the U.S. Federal Standard 1015 LPC-10 2400bps algorithm.
