Audio/Video Device and Channel Switching/Control Software Servers. Maintains a virtual office model, including the concept of door-state for personal accessibility. Provides call management for point-to-point and (internal) multipoint A/V synchronous connections. Controls devices such as a PictureTel Codec, various brands of baseband A/V switches, and VCRs. It is the A/V analog of the telephone PBX. TAS/IIIF servers at different sites will be able to communicate with each other so that connections can be made between users at different sites in a reasonably transparent manner.
Servers are written in C and run on Sun UNIX platforms (SunOS 4.1.3). The software is quite sophisticated.
Fairly unique as a capability, and extensible through new applications (such as Portholes) which exploit the switching features.
Needs client programs that exploit its intelligence (there are some, see below). Software is part-owned by Xerox. Servers are single-threaded, and thus probably could not handle large sites. Fairly robust but could be improved. Uses Internet services for communications.
The inter-TAS/IIIF communications is currently under development.
These come in several forms: the original HYPERCARD client for MACs; a new SUPERCARD client for MACs; the xcave client for Suns using X; Dom's quickie X client for any System V-compatible X/UNIX system; and the CLI (command line interface) client for simple terminal access on UNIX systems.
Various languages depending on the client; HYPERCARD, SUPERCARD, C, C++.
Client programs are needed to make use of TAS/IIIF; realistically, all of these could undergo another revision.
None of these are very robust. The Mac clients are slow because they are interpreted. xcave is quite complex because of its additional support for active office sensing. The biggest problem is that each handles interactions with the servers in slightly different ways. We are working on a solution that will eliminate much of this, but it is still in the works.
Answers incoming video Codec calls; plays an outgoing message; records an incoming message until the caller hangs up the Codec. Can be used with one or two VCRs (one for outgoing, one for incoming, or one for both).
This is unique.
Uses D-Boxes (see below) for VCR control. Written in C. Demonstrated on PC-DOS (but it locks the whole machine up) and Sun UNIX using our Codec server. It can co-exist with IIIF on a Sun.
Experimental (but pretty simple) software.
An inexpensive RS-232 based controller for Sony Control-L protocol. One D-box can control 2 or 4 VCRs simultaneously.
Device built by Dave Dunfield. The software is written in C but stored in an EEPROM, so end users do not see any software, just a device.
Much cheaper per controlled VCR than the Sony V-Box.
Only supports control-L, but with the development kit, could be extended to other protocols. Not a mass produced product at this time.
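As an illustration of the kind of interface such a controller presents, the hypothetical sketch below builds a command frame addressed to one of the D-Box's VCRs. The byte values, framing, and function name are invented for illustration; the real Control-L commands are defined by Sony's protocol documentation.

```c
#include <stddef.h>

/* Hypothetical command set -- the real Control-L byte values differ
 * and are defined by Sony's protocol. */
enum vcr_command { VCR_STOP = 0x00, VCR_PLAY = 0x01, VCR_RECORD = 0x02 };

/* Build a 3-byte frame for one of the (up to 4) VCRs on a D-Box:
 * [VCR address] [command] [checksum].  Returns the frame length,
 * or 0 if the VCR index is out of range. */
size_t dbox_build_frame(int vcr, enum vcr_command cmd, unsigned char out[3])
{
    if (vcr < 0 || vcr > 3)
        return 0;
    out[0] = (unsigned char)vcr;
    out[1] = (unsigned char)cmd;
    out[2] = (unsigned char)(out[0] ^ out[1]);  /* simple XOR checksum */
    return 3;
}
```

A host program would write such frames to the D-Box's RS-232 port; the one-controller-to-many-VCRs addressing is what makes the D-Box cheaper per controlled VCR than the Sony V-Box.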
Provides generalized telco/baseband audio/computer accessible acquisition, storage, manipulation, and replay of digital voice data. Voice data can be edited in a "tape recorder" like manner. This serves as a general infrastructure for digital voice based applications, such as voice mail (see below).
PC/SCO UNIX, Dialogics telco interface cards, a baseband audio/telco bridge and other computer peripherals. Can also be run without the telco interface on any System V compatible UNIX platform.
Fairly unique in its flexibility. Very extensible for new applications involving digital voice.
Experimental component; still under development, but demonstrable through the voice mail application (see below). The audio interface is not there yet and will be a custom solution, although Dialogics provides a rather expensive alternative.
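The "tape recorder"-like editing model can be illustrated with a simple cut operation over a buffer of 16-bit samples. This is a sketch only, with names of our own invention; the Voice Server's actual API and storage format are not described here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Delete samples [start, start+count) from a buffer of `len` samples,
 * closing the gap, and return the new length.  Illustrative only --
 * the real server edits stored voice data, not an in-memory array. */
size_t voice_cut(int16_t *buf, size_t len, size_t start, size_t count)
{
    if (start >= len)
        return len;
    if (count > len - start)
        count = len - start;
    memmove(buf + start, buf + start + count,
            (len - start - count) * sizeof *buf);
    return len - count;
}
```

An insert operation would be the mirror image (shift right, then copy in); together they give the splice-style editing a tape recorder metaphor implies.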
An application using the voice server technology to capture, transfer, and playback voice mail from a telephone interface, baseband audio interface, or computer workstations with sound capability (PC with SoundBlaster, MAC, Sun SPARC). It (will) support several interfaces: (1) a voice automated attendant for the telephone - the usual type of interface for leaving and picking up messages; (2) a custom X-client (with equivalents on MACs, PCs) for picking up, sending, forwarding etc., voice mail; (3) a version of ELM (a public domain email interface with MIME capability) that will permit similar functions to (2). Voice data is stored and transmitted using ADPCM, and conversion routines are supplied to handle the different audio formats of the signal processing chips in the various computers.
In its most general form it uses the full telco-capable Voice Server, and the rest is in C.
Quite unique in its universality, and bridging of the computer and telco domains. Using the non-telco Voice Server, can support voice mail delivery at virtually any site, including those that normally restrict access to networks for security because it uses conventional email to ship the messages between sites.
To get telco access you need a telco-capable Voice Server. Furthermore, to have telco pickup access you must use the custom client interface (for computer access of mail), as currently the telco pickup and ELM pickup are mutually exclusive (the application knows which interface - custom or ELM - is to be used for each user). It uses Internet mail for message transfer between the ELM interface and the Voice Server, and between Voice Servers at different sites. The custom client does not use mail to communicate with the Voice Server. And of course it uses Internet services for other communications. It uses Internet email addresses to address voice mail (where appropriate).
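The document does not say how the ADPCM voice data is packaged for transport over conventional email, but one plausible choice, consistent with the ELM/MIME interface mentioned above, is base64 content-transfer-encoding. A minimal encoder sketch (the function name is ours):

```c
#include <stddef.h>

/* Encode `len` bytes of (e.g., ADPCM) data as base64.  `out` must hold
 * at least 4 * ((len + 2) / 3) + 1 bytes.  Returns the encoded length.
 * Minimal sketch: no line wrapping, which a MIME mailer would add. */
size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    static const char tab[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;
    for (i = 0; i + 2 < len; i += 3) {
        out[o++] = tab[in[i] >> 2];
        out[o++] = tab[((in[i] & 3) << 4) | (in[i + 1] >> 4)];
        out[o++] = tab[((in[i + 1] & 15) << 2) | (in[i + 2] >> 6)];
        out[o++] = tab[in[i + 2] & 63];
    }
    if (len - i == 1) {                 /* one byte left: pad "==" */
        out[o++] = tab[in[i] >> 2];
        out[o++] = tab[(in[i] & 3) << 4];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {          /* two bytes left: pad "=" */
        out[o++] = tab[in[i] >> 2];
        out[o++] = tab[((in[i] & 3) << 4) | (in[i + 1] >> 4)];
        out[o++] = tab[(in[i + 1] & 15) << 2];
        out[o++] = '=';
    }
    out[o] = '\0';
    return o;
}
```

Because such an encoding survives any 7-bit mail path, it is what lets the non-telco configuration deliver voice mail even to sites that restrict network access.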
Sends, receives short (2 minutes or less) video messages (including audio) between users in a local site. It consists of a custom MAC client, and a VMail server that stores messages in analog (VCRs). It uses the TAS/IIIF servers as a means of routing the baseband A/V to the video storage device.
The MAC client is in SUPERCARD and the VMail software is in C. It needs one or more VCRs and D-boxes to control the VCRs, and of course the TAS/IIIF infrastructure.
Fairly unique application.
Just an experiment right now, and under refinement. Only works in a TAS/IIIF site although this dependence is not really substantial. Video mail is not shipped out of a local site as yet. It may be a little slow because of the analog storage.
We will be replacing the analog server with a fully digital video server. This should improve performance, but will need a dedicated machine. We will be exploring the handling of cross site mail, the integration of the video answering machine as an interface to video mail, and integration with the custom voice mail client.
This is a facility currently under design that will provide capture, storage, retrieval of analog video in a digital form. It is a video analog to the voice server, and in fact its process architecture and extensibility are to be the same. The inputs and outputs will be NTSC baseband video and possibly audio. As the design is not as yet finalized, we may capture audio and video together, otherwise the voice server can be used to handle audio with a suitable data structure to maintain synchronization between the audio and video segments. The initial application will be video mail (replacing the analog devices).
PC/SCO UNIX computer with specialized video processing cards (and/or possibly some combination of DOS chassis that communicate with the UNIX system) and storage peripherals, coded in C and using Internet communications to communicate with other machines.
Very flexible as a platform for different applications, and should be reasonably inexpensive. There are not many digital video servers available commercially.
Still under design, so it is still in the idea stage. The end result will be experimental.
An application for background awareness developed by Xerox. Using the TAS/IIIF servers, it periodically (say once per minute) routes a user's camera to a frame grabber and "takes a snapshot," usually kept in low resolution. Periodically (say once per 5 minutes), subscribers to the Portholes facility are sent an updated composite of snapshots, so that they are aware of who is in their offices, giving a sense of community. Pictures can be sent to subscribers at different sites, so that distributed awareness is possible.
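As an illustration of how such a composite might be assembled, the hypothetical helper below computes where snapshot i lands in a grid of low-resolution tiles. This is our sketch only; the actual Portholes layout code is Xerox's.

```c
/* Compute the top-left pixel of snapshot `i` in a composite laid out
 * `cols` tiles per row, each tile `tw` x `th` pixels.  Illustrative
 * helper -- not the actual Portholes implementation. */
void porthole_tile_origin(int i, int cols, int tw, int th, int *x, int *y)
{
    *x = (i % cols) * tw;
    *y = (i / cols) * th;
}
```

Copying each subscriber's latest low-resolution snapshot to its tile origin, then shipping the composite image, keeps the periodic updates small enough to send across sites.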
Uses TAS/IIIF, uses a frame grabber in a Sun, and the Internet for transfer of images between sites. Written in C. Uses an X-client for the user interface.
This is a unique capability.
It is owned by Xerox. We are studying its usefulness and will have it operating between the Toronto and Ottawa Telepresence sites, and possibly other sites. We will make modifications to improve its utility, if we decide that the idea has sufficient merit. It is experimental in nature.
From Xerox, a real-time, digital video over the Internet application. It will provide a live video connection using a video capture card with a video in a window feature.
Runs on a Sun over the Internet using a specific video card.
One of a few instances of this type of technology, it eliminates the need for baseband video technology and is moving in the direction of high speed networks.
It will saturate most conventional LANs, or a busy fiber LAN. We are studying the usefulness of this approach. It is experimental.
An inexpensive facility for integrating live or recorded baseband video with a conventional graphics (drawing) tool so that the video can be annotated using the graphics tool. The composite image is then generated as baseband video. The net result is that the video can be "marked up" in a manner typical of sports commentary.
Demonstrated on a PC/SCO UNIX machine using X Windows, a video-in-a-window card, and a VGA-to-NTSC converter. A conventional X-based graphics tool was used. The video-in-a-window card places the video feed in the graphics tool's drawing canvas, and the VGA-to-NTSC converter takes the image on the PC screen and generates an NTSC result.
An inexpensive but very useful tool when combined with a similar facility at another site; e.g., the two of them connected by a video Codec. For example, it gives simple shared drawing (but not with truly shared objects), and ability to combine computer output (e.g., programs displayed on a screen or anything else), with a video feed (possibly live, e.g., from a site that needs remote troubleshooting, or from a VCR). The output could be sent to a monitor or just as easily recorded.
It is just a demonstration, but it would be easy to build one from commercially available components.
A system to enable the control of interconnections among A/V components in a room. Consists of a computer controlled A/V switch and software. The software has four layers, each for a different kind of use and user. For the normal (novice) user, there is a list of presets that can be selected. Next, there is a "console" that lets the user make individual connections. For the more expert user, there is an interface that permits the definition and editing of presets. Finally, for the administrator, there is an interface for specifying the configuration of the components with respect to the switch.
A key part of the system is that it has some built-in intelligence. The software can be made aware of the status of the components (such as a VCR) that are connected to the switch. Thus, if one inserts a videotape into a VCR and hits the "Play" button, the system will automatically connect the VCR to a monitor. Likewise, in the middle of a videoconference, if one hits the "play" button, it will reconfigure the system so that the tape is visible to both sides of the conference. On the other hand, if one hits "record," then both sides of the meeting will automatically be recorded. All of this removes a level of complexity from the end user.
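The built-in intelligence described above can be pictured as a small table mapping a device event plus room context to a routing action. This sketch is purely illustrative; the names and structure are ours, not the SUPERCARD prototype's.

```c
/* Illustrative rule table for the DAN's device intelligence. */
enum vcr_event  { EV_PLAY, EV_RECORD };
enum room_state { ROOM_IDLE, ROOM_IN_CONFERENCE };
enum dan_action {
    ACT_ROUTE_VCR_TO_MONITOR,     /* show the tape on a local monitor */
    ACT_ROUTE_VCR_TO_BOTH_SIDES,  /* make the tape visible to both ends */
    ACT_RECORD_BOTH_SIDES         /* record both ends of the meeting */
};

/* Decide what the switch should do when a VCR reports an event. */
enum dan_action dan_on_vcr_event(enum vcr_event ev, enum room_state room)
{
    if (ev == EV_RECORD)
        return ACT_RECORD_BOTH_SIDES;
    return (room == ROOM_IN_CONFERENCE) ? ACT_ROUTE_VCR_TO_BOTH_SIDES
                                        : ACT_ROUTE_VCR_TO_MONITOR;
}
```

The point of centralizing such rules is exactly what the text describes: the end user just presses "Play" or "Record," and the switching complexity stays inside the system.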
A very experimental version has been implemented in SUPERCARD on a MAC. It controls VCRs through D-Boxes, an A/V cross bar switch, and a picture-in-picture (PIP) unit.
This is a fairly unique concept, and in its general form, will be a significant commercial opportunity.
This is a very shaky experimental implementation. We are studying the concepts, and user interfaces. The current user interface needs work, and the software was assembled quickly but not efficiently. It is not robust nor portable.
We may elect to build a second generation DAN using a simple version of TAS/IIIF that will, when combined with the network concepts described later, yield the general solution. In the meantime, different types of user interfaces are being studied by our UI researchers.
A generalized network architecture for telepresence applications that collaborate between different "sites"; a site could be based around a TAS/IIIF server, or a DAN, or similar entities. Using MCS (multipoint communications system) as the communications substrate, the architecture would allow for the incremental addition of new applications, and resources that would be shared among sites. It would provide a good facility for remote control of resources, information sharing, a distributed virtual office model, multipoint conference management, etc.
The prototype will be in C or C++ on a Sun using UNIX (or a PC/SCO UNIX, or both), and will use a version of MCS that is UNIX compatible and runs over TCP/IP as its underlying communications layer.
Will be a unique and extremely flexible facility.
This is just in the design stage. We also need to create the MCS port to UNIX.
An electronic drafting table. The surface is a translucent 2 foot x 3 foot screen onto which a computer display is rear-projected. The screen is actually a digitizing tablet that can sense the position of a stylus on its surface. Hence, a large pen-based work surface is provided. The desk lends itself to stand-alone graphics applications, such as CAD, as well as to collaborative distributed applications, such as Vis-a-Vis.
The image of a Macintosh screen is rear-projected on a semi-transparent Scriptel digitizing table. The Scriptel digitizer tracks the location of an electronic pen on its surface and reports its location to the Macintosh. Rear projection is provided by an overhead projector, which projects an image of the Macintosh screen from an LCD projection panel. The left hand is tracked by a video camera above the desk. Image analysis software running on a Sun SPARCstation locates the hand and relays its location to the Macintosh.
The working surface has a much closer fit than existing systems to both how work is traditionally done (therefore building on existing skills), and how work will be done in the future, when large flat-panel displays become widely available. The unit will support existing applications, and therefore has as much value as an "attention grabber," such as at trade shows, as it does as a product. Works with a variety of computers, such as PC, Macintosh, etc.
The units are bulky and expensive. Also, there is already a commercial product developed independently (but not actively marketed) that may hold protection on some of the design ideas.
A working prototype exists. Work is ongoing at improving the pen driver, and trying to get a better stylus from the manufacturer. The largest hope is for the unit's ability to support new, more "natural" modes of interaction that employ the types of gestures that one typically encounters on drafting tables and whiteboards. This work is ongoing, and we have a serious lead on these techniques.
A technique for supporting multiparty videoconferencing. It has been shown to be far superior to conventional voice-activated switching (VAS). It is a hybrid between VAS and Portholes-like functionality. By default, as with VAS, the current speaker appears in the main motion video display. As well, all other participants are visible in slow-scan images, as with Portholes. Participants can change who they focus their attention upon by simply pointing at that person's Portholes image. In addition, users are aware of who is gazing at them through feedback provided on their display. A mechanism for making private aside comments between participants is also provided.
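The display-selection rule amounts to "current speaker by default, unless the user has pointed at someone's Portholes image." A hypothetical sketch, with names of our own choosing:

```c
/* Sentinel: the user has not pointed at anyone's Portholes image. */
#define NO_FOCUS (-1)

/* Which participant appears in this user's main motion-video display:
 * the current speaker by default, or the participant the user has
 * explicitly focused on.  Illustrative sketch only. */
int main_display_participant(int current_speaker, int user_focus)
{
    return (user_focus == NO_FOCUS) ? current_speaker : user_focus;
}
```

Per-user evaluation of this rule is also what makes the gaze feedback possible: the system knows, for every participant, who is on their main display.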
This was written in C and runs on the Macintosh. Along with using the standard IIIF infrastructure, it also uses an AKAI switch, a synthesizer, a Pitchrider, and a noise gate in order to interpret who is speaking and control the devices appropriately.
Coupled with a product like Vis-a-Vis, this provides the most flexible and powerful paradigm for multiparty desktop meetings. The technique effectively leverages existing technologies, and could be developed on top of existing products now. The idea is original to Telepresence.
There are few, other than that the initial market will require a certain base infrastructure (LAN connectivity and a VAS multipoint unit), limiting the initial market size. Will require significant development work to productize. However, a working prototype has been demonstrated.
Prototype has been built and tested. Concept has been proven effective.
A mechanism that permits the state of your physical door (open, ajar, closed) to be sensed and used to control your electronic accessibility. The notion is that the same mechanism that controls access for physical visitations does the same for electronic ones (telephone or video). Simple techniques are provided to override the default setting. The point is to provide a cheap and transparent mechanism to divert phone calls and other interruptions when in a private meeting, for example.
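As a sketch of the idea, the sensed door state can map to a default accessibility level. The names and exact mapping below are our illustration; IIIF's actual door-state handling, and the override mechanisms, are richer than this.

```c
/* Door states as sensed by the instrumented door. */
enum door_state { DOOR_OPEN, DOOR_AJAR, DOOR_CLOSED };

/* Default electronic accessibility implied by each state
 * (illustrative mapping; users can override the default). */
enum accessibility {
    ACC_ACCEPT,    /* calls connect normally */
    ACC_ANNOUNCE,  /* caller is announced; callee may accept */
    ACC_DIVERT     /* calls diverted, e.g., to an answering machine */
};

enum accessibility default_accessibility(enum door_state d)
{
    switch (d) {
    case DOOR_OPEN: return ACC_ACCEPT;
    case DOOR_AJAR: return ACC_ANNOUNCE;
    default:        return ACC_DIVERT;
    }
}
```

The appeal of the mechanism is that this mapping is applied transparently: closing the physical door is all the user has to do.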
Currently this has been prototyped by securing a mouse to the hinge of a door. The mouse sends back three door states: open, ajar, and closed. The software was developed on the Macintosh in SUPERCARD, although some external functions (XFCNs) were also written in C.
Simple concept. Has a market outside of computers and video. Could be a simple door-activated switch that diverts the phone to an answering machine from the handset, and back again.
To be mature, the unit will have to know a lot more than simply the status of the door and control a switch. A real issue has emerged from our practical experience with desktop-video: the problem of, say, telephone interruptions during a meeting are further compounded when desktop video is introduced into the equation. For example, one has a telephone call in the middle of a videoconference, or vice versa, or a video call and telephone call during a face to face meeting. As the number of channels of entry into the office increases, mechanisms to control access will become increasingly important, but more complex.
The concept is defined. The iiif software understands the concept of door-state. We have a working prototype of an inexpensively instrumented door controlling video access. The current version does not deal with the case of the telephone.
A tool for annotating, indexing, and analyzing videotapes and other time-based data. Small and compact. Runs on a Macintosh computer, including the Powerbook. What is typed is time-stamped, using the same time-base as that of the video being recorded/played. These typed annotations can then be used in analysis, indexing, or annotating the tape.
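Annotations of this kind are naturally keyed by the tape's own time base. The following sketch, in C for consistency with the rest of this document (Timelines itself is in LISP), shows an illustrative structure and lookup; all names are ours.

```c
#include <stddef.h>

/* A time-stamped annotation, keyed by the tape's time base. */
struct annotation {
    long        time;   /* timecode at which the note was typed */
    const char *text;
};

/* Return the index of the last annotation at or before `time`, or -1
 * if none.  Binary search; assumes the array is sorted by time, which
 * typing order naturally produces. */
long annotation_at(const struct annotation *a, size_t n, long time)
{
    long lo = 0, hi = (long)n - 1, best = -1;
    while (lo <= hi) {
        long mid = (lo + hi) / 2;
        if (a[mid].time <= time) {
            best = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    return best;
}
```

Such a lookup is what makes the annotations usable as an index: seeking the tape to a timecode recovers the note in effect at that moment, and vice versa.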
As was noted in the description, Timelines runs on a Macintosh and was coded in LISP. The software "speaks" to all Control-L devices.
The system works and is usable. It is built upon commercially available components.
There are few. It would have to be rewritten in more robust code, but it is ready for transfer.
The manually operated version is in late beta testing. Documentation is inadequate at the moment. Work is ongoing.
A key concept growing out of the Telepresence project is the notion of a "video surrogate." A video surrogate occupies the same physical place that would otherwise be occupied by the remote participant if that participant were physically present. The benefit is that relationships between physical location, social functions, and social distance are maintained. For example, in a conference room, there would be a video surrogate at the front which would be used for remote presentations, but there would also be a surrogate at the back, around the table, to be "occupied" by a remote auditor/attendee. The examples that follow are specific instantiations of this notion of video surrogate.
In a four-way meeting, each participant would see each of the 3 remote participants in a separate video surrogate. The geometry of the round table would be preserved. Because each person is represented by a surrogate, things like gaze awareness and the ability to redirect one's gaze are supported.
Besides their application in multiparty meetings, the physical design of the Hydra units is also well suited for desktop video and dyadic desktop videoconferencing. They provide an alternative to having the video appear in a window on the computer screen, where there is contention for screen real estate. Because of the tall, narrow form factor of the Hydra units, the video image appears adjacent to the computer monitor, but it does not interfere. And, because of the small footprint, one pays a minimal price for having an additional display.
These are small desk-top video surrogates. Each unit consists of an LCD monitor, CCD camera, and speaker. They are about 11 inches high and have a footprint of about 3" x 3". They were originally developed to support "around the table" meetings.
For small group meetings, the Hydra units have proven a very effective way to support multiparty meetings. They are cost effective, and have proven useful even in point-to-point conferencing.
The underlying infrastructure to support Hydra-like multiparty meetings is not really economically or technologically feasible at this point.
Prototypes have been built and tested. The units could be replicated and developed commercially.
Two video surrogates are used in each office. One is on the desktop ("near") and one by the door (far). Each consists of a camera, monitor and speaker. People entering or glancing into an office appear by the door. As with physically present visitors, they only enter when invited. Having these two surrogates means that one is better able to preserve conventional social mores concerning social distance, approach, and departure.
The near/far camera use is controlled by the Desk Area Network (DAN). The A/V connections are made using the standard IIIF server protocol, and the DAN routes the incoming and outgoing feeds to the appropriate near/far devices.
The approach is simple to deploy and results in a significant change from conventional desk-top video.
There are few, other than logistical issues of switching and deploying the additional gear. Also, the technique is not easily adapted when the video is digital rather than analogue.
Video conference rooms are equipped with a main camera, monitor, and speaker at the front, but also one or more video surrogates (camera/monitor/speaker) at the back. The units at the back are for remote attendees to the meeting, whereas those at the front are for the presenter. If people change roles - sometimes presenting and sometimes in the audience - then, as with physically present participants, they move between the front and their place around the table.
In order to perform Front-2-Back-2-Front conferencing, the A/V inputs and outputs are controlled by the Desk Area Network (DAN). The A/V connections are made using the standard IIIF server protocol, and the DAN routes the incoming and outgoing feeds to the appropriate devices depending on the role of the video surrogate (presenter or audience).
This means that conference rooms are far more flexible. No more will there be a remote attendee sitting at the front behind the back of the local presenter. Also, this layout provides a new management tool. That is, a manager, or other would-be-participant, who cannot attend a meeting, can virtually "sit at the back" of the meeting, while physically remaining in their office, thereby being able to audit the meeting without having to physically attend. In addition, the back of room camera provides an infrastructure for recording the meeting/presentation, so that those unable to attend can do so at a later time.
Few. Simply requires a way to reconfigure the room. This can be done easily with the Desk Area Network, discussed above.
Implemented and tested. Currently out of commission while the conference room is being reconfigured.
A methodology to be used for:
The method provides a series of modules that cover all aspects of the adoption process. It is specifically designed to be responsive to the needs of the participating organisation - the modular format encourages custom tailoring of the method within each organisation.
A number of techniques are used to:
There are three basic stages in the process of introducing new technology into an organisation - planning, installation or deployment and evaluation. Associated with each stage are a number of issues, the relevance of which must be understood in the context of the innovating organisation. The approach developed by the Telepresence Project provides:
Given the expense of new systems and the high cost to the organisation of failure to adopt, both financial and in terms of morale, it is crucial to evaluate organisational readiness. The key issues at this stage focus on this factor. We recognize that innovation frequently fails for social, not technical, reasons.
At this stage we look at the workplace and the organisation from a variety of perspectives - e.g., work practices, structure, technology. Modules that may be relevant here include:
The results from this stage will condition the remainder of the process.
If a decision is made to proceed, we move to the First Transition Stage:
In this period it is essential to get the future users of the technology involved. The modules developed for this stage focus on encouraging the future users to become stakeholders in the adoption process. The modules at this stage include:
These modules are tailormade to address the issues identified in the Planning Stage.
The user is the focus of this stage, both at the individual level and through the Users Group. The modules at this stage include:
This is a relatively short period during which the technical aspects of the system are stabilized and the users gain experience with the system. The Users Group remains the focus at this stage, providing a forum for discussion and sharing of experiences.
The information gathered at the Users Group allows for fine tuning of the system and processes.
In the final stage there is a formal evaluation of the process. Modules include:
The strength of this approach is that it has been developed by a research team, has been tested in the field in real working organisations, and has been used in the introduction of an unfamiliar and highly innovative technology which involves the use of video in the workplace.
The methodology is informed by a philosophy which incorporates and respects the end user, and is grounded in social science theory. The method has been developed by an interdisciplinary team which includes behavioural scientists, sociologists, human computer interface designers and computer scientists.
The methodology is being tested in the field in organisations that are introducing Telepresence systems.
The technology for which this methodology was developed is concerned with the introduction into the workplace of technology which incorporates video. The use of video is unfamiliar in most workplaces and raises a number of issues which can thwart adoption. Thus, the method has been tested under strict conditions.
None externally, but internally we need to get copyright protection/intellectual property protection for final documents (e.g., social science instruments, concepts, techniques, etc.)
VANNA is a system for video annotation, or "coding" videotapes. It is software that runs on Apple MACINTOSH computers and controls a VCR, providing the user with the ability to associate text or tags with particular segments of the tape. As a result, it allows a videotape to be stored and accessed as an annotated document.
VANNA is a HyperCard stack which runs on a Mac II. Other hardware requirements include a VCR with Control-L facility, a colour monitor, and Control-L cable.
VANNA was designed to address a problem that many researchers in human interaction experience as a result of using videotape. Video is a very rich recording medium, but the process of analyzing video takes much longer than the viewing time of the tape. VANNA allows a skilled coder to code without stopping the tape. This drastically reduces the time to code a tape.
On-Track (the new version of VidClip) does not seem to work with a Mac IIfx. When using System 7, virtual memory must be turned off for use with VidClip or On-Track. Mac IIfx computers have a very different hardware implementation of the serial port; since the VidClip/On-Track software writes directly to the hardware, it will not work on a Mac IIfx unless you install the FX serial port compatibility control panel, set to "compatible" (rather than "fast").
One of the outputs of the Ontario Telepresence Project is a stream of innovations which may be relevant to the products and services of the Project's Industrial Partners. These innovations manifest themselves as hardware, software, system prototypes, and methodologies.
This document gives a brief description of some of the technologies and methodologies that have been developed internally or adopted from outside sources. For more information on any of these, contact the Managing Director of the Ontario Telepresence Project.