Groupware on the Internet
Diploma Thesis, from
Technical University Graz
Institut für Informationsverarbeitung und
Computergestützte neue Medien
Supervisor: H. Maurer
This is a diploma thesis about groupware and the Internet as a platform for creating, deploying and running applications which support teamwork.
For all kinds of organisations (business companies, universities, etc.) it is becoming increasingly important to have flexible teams and computer-supported teamwork. Teams need to be established across organisational, political and geographical boundaries. Organisations want to integrate and work together with partner organisations and business clients.
Rapid Application Development (RAD) is starting to play an important role for software developers due to the demand for shorter development time of less expensive and more stable software. For more flexibility, the software should run cross-platform. That way the software can not only be used in a clearly defined environment of an Intranet but by millions of Internet users. Software deployment and maintenance have to be made easier.
Groupware needs to provide a familiar context and working environment for members of a distributed team. The software must provide necessary social and task-oriented information to facilitate coordination tasks and to make each member’s work more efficient. However, since groupware also interferes with and changes the subtle and complex social dynamics that are common to teams, these social conflicts, which exist in all human teams, need to be recognised and resolved with the support of the groupware system.
This flexibility and interference with existing social structures pose new challenges for software developers.
This diploma thesis deals with groupware and with the Internet as a platform for programming, distributing and running applications that are meant to simplify and support teamwork.
For all kinds of organisations (companies, universities, etc.) it is becoming ever more important to have a flexible team organisation and computer-supported teamwork. Teams increasingly work across organisational, political and geographical boundaries. Organisations increasingly want to integrate partner organisations and their customers into their own structure.
Rapid Application Development (RAD) is playing a greater role for software producers because development time is getting shorter and software cheaper, yet higher quality still has to be ensured. Software should be platform-independent so that it can be deployed more flexibly. That way the software can be used not only in the clearly defined and known environment of an Intranet but by all users of the Internet. In addition, software distribution and maintenance should be simpler and cheaper.
One task of groupware is to create a familiar working environment for the people in a team. The software must supply the necessary social and work-related information to make team coordination simpler and work more efficient. This also influences and changes the social structure within a team. Social conflicts, as they exist in all teams, must be recognised and also resolved with the help of the groupware.
This necessary flexibility and the influence on the social structure pose a new challenge for software producers.
I would like to take this opportunity to thank my professor and the supervisor of this diploma thesis, Hermann Maurer. His advice helped me a great deal in completing this thesis. He also helped me to establish contact with the university Unimas in Sarawak, Borneo, so that I could begin my diploma thesis there.
I dedicate this work to my parents in gratitude. They supported me financially and encouraged me on my way. I would also like to thank my siblings and grandparents as well as all my relatives for their support.
Special thanks also go to my girlfriend Barbara, who stood by me during the difficult time of my stay abroad, and to my long-standing friends Helmut and Wolfgang.
I hereby certify that the work reported in this thesis is my own and that work performed by others is appropriately cited.
Signature of Author:
Chapter 1 Introduction and Motivation
1.1 Introduction
1.2 Organisation of this thesis
Chapter 2 The Internet
2.1 History of the Internet
2.2 The Internet Collaboration Platform
2.2.1 Information access
2.2.2 Communication and collaboration
2.3 Focus of the Internet
Chapter 3 Internet Applications
3.1 First Generation Hypermedia Systems
3.1.1 World Wide Web
3.1.2 Problems of 1st Generation Systems
3.2 2nd Generation Hypermedia System: Towards a Workplace for Collaboration
3.2.1 Maintenance support
3.2.2 Structured Hypermedia
3.2.3 Meta-Information for Objects
3.2.4 Advanced Links
3.2.5 Access Control and Logging
3.2.6 Versioning of documents
3.2.7 More Precise Information
3.3 Hyperwave – the first full-scale implementation of a 2nd generation system
3.3.1 Objects and object attributes
3.3.3 Navigational Concepts
3.3.4 Search in Hyperwave
3.3.5 Document Management
3.3.6 Access Control
Chapter 4 Computer Supported Cooperative Work (CSCW)
4.1 Definition
4.1.1 CSCW (Computer-Supported Cooperative Work)
4.1.3 Workgroup Computing
4.1.4 Workflow Management
4.2 General Problems with Groupware
4.2.1 Technical Problems
4.2.2 Social Problems
4.3 Different Categorisation of Groupware
4.3.1 Focus of the Cooperative Activity
4.3.2 Amount of Structure Involved
4.3.3 Degree of Embedded Semantics of the Collaborative Task
4.3.4 Levels of Sharing
4.3.5 Location of Users
4.3.6 Time of Collaboration
Chapter 5 Workgroup Computing
5.1 Architectures
5.1.1 Distributed or Client-to-Client
5.1.2 Selected Client as a Serialisation Point
5.1.3 Central or Client-Server
5.1.4 Paradigm for Enabling Large-Scale Group Collaboration
5.2 Awareness
5.2.1 Types of Awareness
5.2.2 Filtering Awareness Information
5.2.3 Issues and problems
5.3 System-Controlled Concurrency Control
5.3.1 Drawbacks of Traditional Database Concurrency Control
5.3.2 Different Categorisation of Concurrency Control
5.3.3 Requirements for Dynamism
5.4 Social Conflict Management
5.4.1 Information Flow
5.4.2 System-Supported Social Management
5.5 Communication
5.5.1 Parameters for Groupware Communication Protocols
5.5.2 Reasons for Implementing Communication Facilities
5.6 Collaboration
5.6.1 Example: Common Text Editing
5.7 Lotus NSTP 1.0
5.7.1 Basic Conceptual Model
5.7.2 Support for awareness
5.7.3 Support for Communication
5.7.4 Examples for the Usage of NSTP
5.7.5 Common Text Editor
Chapter 6 Workflow Management
6.1 Features of Workflow Systems
6.2 General Problems with Workflow Systems
6.3 Generations
6.3.2 Factored Application
6.3.3 Tailorable Service (now)
6.3.4 Embedded Enabler
6.4 Differentiation of Systems
6.4.1 Development Methods
6.4.2 Process Modes
Chapter 7 Creation of Internet-Based Groupware
7.1 The WWW as a Platform for Collaboration
7.2 Requirements for Applications
7.3 Programming Interfaces
7.3.1 Location of Software Execution
7.3.2 Security Issues
7.4 Programming Languages
7.5 Component Models
7.5.1 JavaBeans Framework Model
7.5.2 ActiveX Framework Model
7.6 XML (Extensible Markup Language)
7.6.1 Limitations of HTML
7.6.2 XML Differs from HTML
7.6.3 Web Applications with XML
Chapter 8 Functionality of an Asynchronous Conference System
8.1 Using the WWW as Platform for Asynchronous Conferencing
8.2 Functionality of Current Conferencing Systems
8.3 Problems of Current Conferencing Systems
8.4 Functionality of a Full-Featured WWW-Based Conference System
8.4.1 Full support of a rich text format (HTML)
8.4.2 Searchable Meta Information
8.5 Hierarchy of Documents
8.6 List of Keywords (LoK)
8.6.1 Adding a New Listword
8.6.2 Moving a Listword in the Hierarchy
8.6.3 Removing a Listword
8.7 Searching for Documents
8.7.1 Browsing the Hierarchy of Listwords
8.7.3 Similarity of Documents
8.7.4 Forum Dynamics
8.8 Maintenance
8.9 Problems
Chapter 9 Summary
Chapter 10 References
Introduction and Motivation
In the last 20 years, the Internet has evolved into a global marketplace for information. Education was based on "teaching facts" for a very long time. Therefore, it is not surprising that the Internet has become important for information dissemination and global accessibility. In business, too, people need to cooperate with other people and use business processes to get their work done. Much like the step from the telegraph to the telephone, the step to Internet technology connects people to a richer flow of information.
However, in the last few years interaction and collaboration with teams has become increasingly important. Therefore, the Web is changing from being merely an information source to a platform for applications and more recently to a workplace for global collaboration.
Collaboration is happening in increasingly heterogeneous environments. The trend shows a movement from traditional internal organisational collaboration towards open and global workgroups. This implies that traditional collaboration tools need to move towards open standards to attract customers. It will be shown that the Internet is one of those open standards used by millions of people every day.
The following is a brief outline of the organisation of the thesis.
- Chapter 1 – Introduction
- Chapter 2 – The Internet: The Internet, its history and its ability to serve as a collaborative platform are described in this chapter.
- Chapter 3 – Internet Applications: The most common applications for the everyday use of the Internet are explained in this chapter.
- Chapter 4 – Computer Supported Cooperative Work: Important terminology and general problems concerning groupware are explained. Different ways of categorisation are introduced. The difference between workgroup computing and workflow is shown.
- Chapter 5 – Workgroup Computing: Workgroup computing, different architectures and necessary components (awareness, concurrency control, and social conflict management) are explained. The two interaction paradigms, communication and collaboration, are introduced.
- Chapter 6 – Workflow Management: Features, general problems, different generations of workflow systems and ways to differentiate them are shown.
- Chapter 7 – Creation of Internet-Based Groupware: Tools, programming languages, interfaces and their requirements to make them suitable to create applications for the Internet are explained.
- Chapter 8 – Functionality of an Asynchronous Conference System: The functionality of current discussion forums and necessary improvements are investigated.
- Chapter 9 – Summary
- Chapter 10 – References
L. Kleinrock at MIT published the first paper on packet switching theory in July 1961 and the first book on the subject in 1964. J.C.R. Licklider of MIT discussed his "Galactic Network" concept in August 1962. He envisioned a globally interconnected set of computers through which everyone could quickly access data and programs from any site. In 1965, the first (however small) wide-area computer network was built using a circuit switched telephone system, which was totally inadequate for the job. The need for packet switching was confirmed.
The plan for the "ARPANET" was published in 1967. As the ARPANET sites completed implementing NCP (a predecessor protocol of TCP) during the period 1971-1972, the network users were finally able to begin to develop applications. In 1972, the first "hot" application, electronic mail, was introduced.
DARPA supported UC Berkeley in investigating modifications to the Unix operating system, including incorporating TCP/IP. The transition of the ARPANET host protocol from NCP to TCP/IP was completed in 1983 – and the Internet was born.
Taken from [Leiner97]. For further information visit [Chambe97].
- More and more people are permanently connected to the Internet: This makes a shift from asynchronous (e.g. Email) to synchronous (e.g. MS NetMeeting, Netscape Conference) communication and collaboration possible. That way the Internet can be used as a platform for interaction with other people. [McGr97] shows an example of the collaborative potential of bringing so many experts into real-time contact.
- Cross-platform interoperability based on open standards: Universities in particular (but also other organisations), with their heterogeneous networks and different platforms, need to focus on open standards to remain open for computer-based groupwork. There are several standards defined for the Internet that are widely accepted. Creating applications which use these open standards ensures an open door to the big marketplace of the whole Internet with its millions of users.
- Thin clients: The computing power of CPUs keeps increasing rapidly, but companies and universities cannot afford to replace their hardware every year. The Internet is mostly based on a client-server architecture. Clients just install the browser, and applications can be downloaded on demand. Therefore, the cost of ownership will be much lower, and software deployment and maintenance become easier.
This increasing popularity has motivated both research and industrial environments to investigate the Internet’s potential for groupwork and collaboration support. Consequently, various WWW tools have been developed whose purpose is to enhance and enable communication and collaboration via WWW.
Some tendencies can be observed:
- Tools based on 1st generation hypermedia systems like Netscape Suitespot add more features to address the need for collaboration and communication support. Email and discussion servers are added to the product line. Browser packages not only retrieve information from Web servers, but also provide support for email and news servers. Even tools for synchronous communication and collaboration are integrated.
- Highly integrated and proprietary groupware products like Lotus Notes move towards open standards or provide gateways for communication and data access.
- 2nd generation hypermedia systems like Hyperwave are coming into existence. They already integrate features for collaboration and communication.
- Programming interfaces are added on the client and server side. This will make the medium even richer and more interactive.
The evolution has been from Web publishing to implementing Web-based applications, and continues to grow into collaboration and workflow projects.
- Publishing and authoring: Since the addition of the World Wide Web to the Internet, often the first use of this medium is to publish information much like an electronic billboard.
- Web-based applications: Soon schools and universities discovered that they could use the Internet as a platform for applications to deliver more interactivity. This for example was used to create interactive learning software for learning on demand. In addition, companies wanted to provide direct access to their corporate data and applications to consumers. In fact, some companies’ only storefront is on the Web. Building a Web-based user interface to a business application also makes it feasible to quickly and inexpensively extend corporate applications and data to remote employees and business partners. That way the Internet helps to deliver information that is more accurate.
- Workflow and collaboration: The next step is to use the Internet not only as an application platform but even as a workplace for distributed teams. Collaboration and communication can occur between co-workers, but also with customers and suppliers. Organisations find ways to provide broader access to business processes and to facilitate more effective communication. When person-to-person communication can occur quickly, or without the need to travel, costs go down. When manual business processes are automated, quality often increases while the costs decrease.
The World Wide Web (WWW) can be described as an Internet-wide distributed heterogeneous hypermedia information retrieval system.
HTTP is a stateless protocol. The client/server connection is only maintained for the duration of one transaction. Every transaction initiated by a client establishes a connection with the server, which is closed when the transaction is complete.
This has one major drawback. Opening a TCP/IP connection is time-consuming, and if a document with several pictures needs to be downloaded, a separate connection has to be established for each object. Luckily, modern browsers can download several objects simultaneously.
The life cycle of a connection consists of four parts:
- Connecting of the client to the server’s port
- Request of certain information by the client
- Response of the server to the request
- Closing of the connection
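The four steps above can be sketched with a plain TCP socket. This is only an illustration under stated assumptions: the names `HelloHandler`, `fetch_once` and `demo` are hypothetical, and a tiny local server from Python's standard library stands in for a real Web server so that the example runs without network access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny local stand-in server (an assumption made for this sketch)."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_once(host, port, path="/"):
    """Run one HTTP/1.0 transaction: connect, request, response, close."""
    # 1. Connect to the server's port.
    sock = socket.create_connection((host, port), timeout=5)
    try:
        # 2. Send the request for one document.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # 3. Read the response until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        # 4. Close the connection; nothing is remembered for the next request.
        sock.close()
    return b"".join(chunks)


def demo():
    """Start the stand-in server on a free local port and fetch one page."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        return fetch_once(host, port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once` once per inline image would repeat all four steps each time, which is the per-object connection cost described above.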
HTML is an SGML-like mark-up language. Tags are used to format the text and include multimedia documents like inline images and hyperlinks. Therefore, hyperlinks are embedded in the document and not extracted by the server. Nor is meta-information explicitly extracted and kept in a separate storage.
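Because links are embedded in the mark-up, any program that wants to know where a page points has to parse the page itself. The following minimal sketch, using Python's standard `html.parser` module, shows this; the class name `LinkExtractor` and the helper `extract_links` are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the targets of <a href="..."> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Hyperlinks are ordinary tags inside the text, so they are
        # discovered only while the document is being parsed.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all hyperlink targets embedded in the given HTML text."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

A first generation server never performs this step, so it cannot tell which documents point to a given page.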
URLs consist mainly of three parts:
- The protocol used to access the document, e.g. "http", "https", "gopher", "ftp"
- The host name (or IP address) of the server hosting the document, e.g. "www.iicm.edu"
- The location and name of the document on the server, e.g. "/myspace/test.html"
Thus the URL not only points to the document, but also provides information about the protocol to use. That way, WWW can be used to access documents on a heterogeneous network using different protocols.
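As a small illustration, the three parts can be separated with Python's standard `urllib.parse` module. The complete URL used in the usage note is assembled from the example values in the list above and is assumed only for demonstration; `url_parts` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into (protocol, host, document location)."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path
```

For example, `url_parts("http://www.iicm.edu/myspace/test.html")` separates the protocol `http`, the host `www.iicm.edu` and the location `/myspace/test.html`.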
First generation Web servers work fine for small Web sites of about 50-200 pages. However, today’s Web sites are growing. Initially, Web sites served for the dissemination of documents. Nowadays, whole dictionaries, electronic books and newspapers are published on the Web. People are starting to develop applications for the Web and to interact with the applications and with other people. Therefore, information is changing dynamically and becoming more interactive. Automatic maintenance, navigation support and security are becoming more important [Andrew94].
At the beginning, the Web was mainly used for dissemination of text and pictures. There is also a shift towards using multimedia like (streaming) audio and video.
While browsing through the Web, the user just sees one page at a time. There is no overview of the server’s information structure. After following several links, the user often feels "lost".
- Flat storage model: The documents in WWW servers are not structured. The storage model is flat. Hierarchies can only be recognised by examining URLs, and there are no features like "go to the parent directory". Of course, today’s browsers offer a "Back" button, but what if there is more than one parent?
- Unidirectional hyperlinks: Navigation is only based on unidirectional hyperlinks. The user sees links pointing from this document to others, but not links pointing to the document. Therefore, it is impossible or very difficult to show a local map to see where the user came from and where he can go.
- No global navigation map: There is no way to find out in which "part of the information space" of the server one is located at a given moment, or to get an overview of the hierarchical structure of the Web. It is much easier for people to memorise hierarchical structures or 3-dimensional maps of the server’s info-space than the spaghetti-bowl links of the Web.
It was mentioned in the last section that links are unidirectional, which makes it difficult to generate local maps. Another point is that links are embedded and in principle ignored by the server. It is up to the user to maintain the Web’s integrity:
- Link maintenance : [Andrew94] Links are stored embedded in the document. Therefore, the Web servers store no information about which links point to a certain document. If a document is deleted, removed, or just renamed, the links pointing to this document become dangling. When the user tries to follow that link, he just gets the famous "404 not found" response. It is up to the operator to maintain the Web’s integrity, which is manageable for 50 to 100 Web pages with the help of 3
Early Web servers were merely used for accessing and downloading documents, not for an information flow in both directions. Hence security was not of much concern. If the Web is also used for disseminating confidential information or running applications on the Intranet or Internet, security and access control become an important issue.
- Access control : At the very beginning of WWW, there was no access control. Of course, there is still the access control of the operating system, but the server usually runs under a privileged account and effectively prevents access control that way. This is a problem if the user can make the server start a program (like what happens with CGI scripts). Today’s WWW servers like Netscape’s Enterprise Server 3.0 are starting to address this problem.
- Different views for different people : It could be helpful to have different visibility of objects on the Web site instead of constructing two or more sites for Intra-, Internet or other user groups.
2nd Generation Hypermedia System: Towards a Workplace for Collaboration
Now let’s take a more careful look at requirements of 2nd generation hypermedia systems to support the shift from simple information serving to a full-featured workplace. For a comparison of WWW and Hyperwave, see [Pam95].
If one shifts the workplace from the desktop environment to the Web, one has to provide access to a huge number of documents, such as the user’s repositories, libraries and other background information. At the same time, some of these documents are constantly changing. So there is a need for systematic support for automatic maintenance of the Web’s integrity.
- Link management in a changing environment : Automatic link management [Andrew94] means that the server checks which links point to deleted documents and deletes these as well. If documents are moved, the links pointing to this document should still be valid.
- Abstraction of resource identification : In 1
Today’s Web with its links is often compared to a spaghetti bowl. An intuitive structure such as multiple directed acyclic graphs (DAGs) could help the user to avoid the "Lost in Hyperspace" syndrome and provide a subspace for semantically similar documents.
- Navigational support : like local and global maps of the Web [Andrew94].
- Gathering of semantically similar documents : That’s what most people do now in their local file system. There is one directory for personal data, one for programs, and one for drivers...
- Setting search scope: Instead of searching the whole Web to find something about mushrooms, the search scope could be limited to the Fungi hierarchy. This would decrease the workload of the server and help to avoid retrieving unrelated documents, such as documents about the Internet mushroom project.
One major problem Internet users are facing nowadays is the retrieval of useful information. The Internet provides access to Terabytes of data, but people spend more and more time searching for it. Meta-information does not just make it easier to retrieve useful information but also helps to manage the Web.
Because the need for machine-usable descriptions of collections of distributed information is increasing rapidly there have been a number of proposals in the recent past that have made significant steps toward this goal, including MCF using XML (see [Guha97]) and PICS (see [Resnick96], [Resnick]).
- Objects can be identified by their properties: Often people know the author or the date of creation of a document. This makes it possible to search more efficiently than with full-text searches.
- Objects can become valid or invalid : Some announcements and other information have just a limited time of validity. After that time the document or part of it can be hidden or automatically removed to avoid outdated information.
- Provides information about an object, before downloading or viewing it : The system can provide information like mime type of the document...
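The three uses of meta-information listed above can be sketched with a small record type. This is a minimal sketch and not the data model of any particular server: the field names and the helpers `is_valid`, `find_by_author` and `preview` are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ObjectMeta:
    """Meta-information kept separately from the document body (hypothetical fields)."""
    name: str
    author: str
    created: date
    mime_type: str
    valid_until: Optional[date] = None  # None means the object never expires


def is_valid(meta, today):
    """An object is shown only until its validity period has passed."""
    return meta.valid_until is None or today <= meta.valid_until


def find_by_author(objects, author, today):
    """Search by a known property instead of scanning the full text."""
    return [m.name for m in objects if m.author == author and is_valid(m, today)]


def preview(meta):
    """Describe an object before it is downloaded or viewed."""
    return f"{meta.name} ({meta.mime_type}) by {meta.author}"
```

An announcement with a `valid_until` date simply drops out of the search results once that date has passed, without anyone having to delete it by hand.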
Automatic link management and meta-information for links have already been mentioned. In 2nd generation hypermedia systems links should be open for various kinds of multimedia documents and sources.
- Keeping document integrity : Adding a link to a document in a traditional system means that the document itself has to be changed. So it is impossible to add links to documents without write permission or to read-only sources.
- Extensible media and link support : Today’s Webs serve not only HTML-documents, but also Postscript and PDF files, videos and audio files. Storing separate links makes it easier to link various kinds of file types.
- Bi-directional links : this makes it easier to generate a local map of documents pointing to a file on the fly or showing the parents of a file. Also this feature is necessary to maintain link consistency. If a document is deleted, links pointing to it can easily be found and removed as well.
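A minimal sketch of links stored outside the documents may make these points concrete. The class `LinkStore` and its method names are hypothetical; it only assumes that documents are identified by name and that a link is a pair of source and target.

```python
from collections import defaultdict


class LinkStore:
    """Links kept outside the documents and indexed in both directions."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # source -> targets
        self._incoming = defaultdict(set)  # target -> sources

    def add_link(self, source, target):
        # Neither document is modified, so read-only sources and
        # non-HTML files such as PDF can be linked as well.
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def links_from(self, doc):
        return sorted(self._outgoing.get(doc, ()))

    def links_to(self, doc):
        """The reverse direction: which documents point to doc."""
        return sorted(self._incoming.get(doc, ()))

    def delete_document(self, doc):
        """Remove every link that starts or ends at doc; return the removed links."""
        removed = []
        for source in self._incoming.pop(doc, set()):
            self._outgoing[source].discard(doc)
            removed.append((source, doc))
        for target in self._outgoing.pop(doc, set()):
            self._incoming[target].discard(doc)
            removed.append((doc, target))
        return sorted(removed)
```

Because the reverse index exists, `links_to` gives the parents needed for a local map, and `delete_document` leaves no dangling links behind.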
When the Web is used as an application platform, version control including restoration of older versions and document locking becomes important.
The usage of the Web is getting more and more automated. Agent technologies especially are used to retrieve useful information, or to fill out forms, etc. However, for the application to work efficiently the kind of information needs to be clearly defined. This would make the exchange of information more efficient.
Hyperwave (formerly called Hyper-G) claims to be the first second-generation hypermedia system among Internet web servers. The server is described in [Maurer97] as a "distributed database system that is WWW transparent", as a "WWW oriented document management system" or as an "advanced WWW server with integrated database facilities".
The use of Hyperwave for a Web based collaboration of different authors to write a book will be shown in chapter 220.127.116.11.
Hyperwave uses an object-oriented approach to store documents, links, etc. Every object (document, collection, link...) is stored with meta-information for name, title(s) in different languages, keywords, author, date of last modification, etc.
Every object gets a global ID on insertion. Therefore, even if the title, the location, or the name changes, Hyperwave still references the right object. This prevents the disadvantages of using static URLs. Other Hyperwave servers can use these global IDs to point to a remote object on another Hyperwave server.
Indexed meta-information [Kappe97] provides faster access to relevant information and is customisable. Thus the search for information is faster and the search result can be more relevant.
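The idea of referencing objects by an ID rather than by name or location can be illustrated with a toy object store. This is a sketch under stated assumptions, not Hyperwave's implementation or API: `ObjectStore` is a hypothetical class, the IDs are simple local counters rather than globally unique identifiers, and `find` scans all objects where a real system would use an index.

```python
import itertools


class ObjectStore:
    """Toy store in which every object gets an ID on insertion."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._objects = {}

    def insert(self, name, **attributes):
        """Store an object with its meta-information and return its ID."""
        object_id = next(self._next_id)
        self._objects[object_id] = {"name": name, **attributes}
        return object_id

    def rename(self, object_id, new_name):
        # The ID stays the same, so references held elsewhere remain valid.
        self._objects[object_id]["name"] = new_name

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return the IDs of all objects whose attributes match the criteria."""
        return sorted(
            oid
            for oid, attrs in self._objects.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        )
```

A reference that stores only the ID returned by `insert` still resolves after `rename`, which is the property a static URL lacks.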
Hyperwave offers the navigational concept of Hyperlinks, like in WWW, and of hierarchies, like in Gopher. Therefore, the server helps the user in getting an overview of the structure of a web site and helps to avoid the "lost in Hyperspace" syndrome.
- Collection hierarchy : Every document being uploaded has to be inserted in at least one collection (or cluster). Therefore, the document is accessible through browsing the hierarchies even if no link is pointing to it. The advantages of this concept are:
- It avoids many navigational links to create and maintain.
- Every document is visible on insertion even if no link is pointing to it, because it is integrated in the hierarchy and can be accessed through it.
- If one document or collection belongs semantically to more than one group, it can be inserted to more than one collection but is still physically stored just once. That way similar objects can be semantically gathered in one sub-graph.
- The collections can be used for defining search scopes. This improves the use of search tools (which is discussed later).
- Hyperlinks are objects like collections or documents with their own meta-information and access rights. They are not stored embedded in the document, but externally. Links consist of a source and destination anchor. The object-oriented approach and the fact that links are stored externally make links more flexible in Hyperwave. Therefore, Hyperwave can even link to Postscript files or frames of movies. The advantages are:
- Links are bi-directional: The user can follow them in both directions. It is easier for the server to create a local map of a document (parents and children of a document).
- Different visibility : Different visibility for different users (access rights). Predefined trails e.g. can be provided for a special group of users.
- Link types : Like any object in Hyperwave, links have meta-information attached. Therefore, different link types can be defined and even be searched for.
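The collection hierarchy and its use as a search scope can be sketched as follows. The sketch is illustrative only and does not reproduce Hyperwave's interfaces: the class `Collections`, its methods, and the assumption that document and collection names are unique are all hypothetical, and the search looks at titles only.

```python
class Collections:
    """Documents stored once and reachable through one or more collections."""

    def __init__(self):
        self._children = {}  # collection -> names of its members
        self._titles = {}    # document -> title (each document stored once)

    def add_collection(self, name, parents=()):
        self._children.setdefault(name, set())
        for parent in parents:
            self._children.setdefault(parent, set()).add(name)

    def add_document(self, name, title, collections):
        # Every document has to be inserted into at least one collection.
        if not collections:
            raise ValueError("a document must belong to at least one collection")
        self._titles[name] = title
        for collection in collections:
            self._children.setdefault(collection, set()).add(name)

    def documents_in(self, scope):
        """All documents reachable below the given collection."""
        seen, stack, found = set(), [scope], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # a member of several collections is visited once
            seen.add(node)
            if node in self._titles:
                found.add(node)
            stack.extend(self._children.get(node, ()))
        return found

    def search(self, word, scope):
        """Title search restricted to one part of the hierarchy."""
        word = word.lower()
        return sorted(
            doc for doc in self.documents_in(scope)
            if word in self._titles[doc].lower()
        )
```

Searching for "mushroom" below a hypothetical "Fungi" collection would then skip a document about the Internet mushroom project that lives only in another branch.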
Hyperwave helps to find information through enhanced navigation paradigms and through a built-in (and thus highly integrated) search engine. Since Version 2.5, administrators are even able to choose between a native and an external (Verity) search engine.
Hyperwave is more powerful than other Web servers combined with search engines due to the following features:
- Context based : As mentioned before, the user can flexibly select the search scope. Even in the newest versions of Netscape (Netscape’s Enterprise Server 3.0) users can only use predefined search scopes.
- Built-in full-text: Every HTML and text document is automatically full-text-indexed on insertion. Using Verity’s search engine, even PDF and various MS Office files are recognised and included in the index.
- Meta-data: Traditional search engines have the (unsolved) problem of not recognising the semantics of a document. Hyperwave lets the user define various kinds of meta-information and keywords. The search for this data is much faster and enhances the quality of retrieval.
Access control in Hyperwave happens at the object level. Rights can be defined for individual users or groups. With those rights, the visibility of objects like links or documents can be controlled. So different groups of users get a different view of the information.
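Object-level visibility can be illustrated with a short sketch. It is not Hyperwave's rights model: the type `StoredObject`, the helpers `can_see` and `visible`, and the rule that an object without listed readers is public are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """A document or link with its own read rights (hypothetical model)."""
    name: str
    readers: set = field(default_factory=set)  # user or group names; empty means public


def can_see(obj, user, groups):
    """A user sees an object if it is public or a right matches the user or a group."""
    if not obj.readers:
        return True
    return user in obj.readers or bool(obj.readers & set(groups))


def visible(objects, user, groups=()):
    """The view of the information space that one particular user gets."""
    return [obj.name for obj in objects if can_see(obj, user, groups)]
```

Two users browsing the same collection thus receive different listings without the site having to be duplicated for each group.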
Computer Supported Cooperative Work (CSCW)
In this thesis the term CSCW is used as abbreviation for the topical area and groupware for those products or applications supporting work groups.
The term "computer-supported cooperative work (CSCW)" was coined by Irene Greif and Paul Cashman in 1984 as a marketing tool for a vision of integrated office IT support -- "...A shorthand way of referring to a set of concerns about supporting multiple individuals working together with computer systems." [Whit96].
There are three areas of research:
- Development of a general understanding of teamwork and coordination.
- Development of concepts and tools for the support of distributed work processes.
- Evaluation of these concepts and tools.
Groupware is software that supports and augments group work. It is a technical term meant to differentiate "group-oriented" products, explicitly designed to assist groups of people working together, from "single-user" products that help people pursue only their isolated tasks [Greenb91].
The goal of groupware is to make the process of people working together more effective. This compares with previous desktop computing innovations – word processing, spreadsheets and the like – that made individual users more productive. CSCW-supporting software is called Groupware.
Examples of groupware components are:
- Desktop conferencing systems,
- Videoconferencing systems,
- Co-authoring features and applications,
- Electronic mail systems and bulletin boards,
- Meeting support systems,
- Workflow systems, and
- Group calendars (automatic meeting scheduling).
Groupware is influenced by several factors:
- The person
- The task
- The organisational structure of the group
- The technology used
In the next two chapters, two different kinds of groupware, "Workgroup Computing" and "Workflow Management", are introduced. There are several reasons to differentiate between the two. [Prinz], for example, uses circulation folders as workflow tools to support structured work processes, and shared workspaces for workgroup computing to provide a working environment for less structured processes.
- The most significant difference is the focus. Workgroup computing focuses on the information being processed, enhancing the user’s ability to share information within workgroups. Workflow emphasises the importance of the process, which acts as a container for information (see [Koulop]).
- Workflow systems need a set of rules that define the steps of the problem-solving process. Workgroup computing is more flexible and spontaneous.
- The user controls "workgroup computing" tools. The user initiates the interaction. Workflow is defined at the beginning of the process and then the Workflow system initiates the necessary actions to finish the task (computer-mediated communication).
- The basic idea of Workflow is to divide the problem into several smaller sub-problems, which can be solved by different people. "Workgroup Computing" focuses on people working together at the same time to solve one big problem.
- The number of participants in Workflow systems can be large, but in workgroup systems, the number of people involved in the solution of a problem is still limited because of the difficulties in concurrency control and coordination.
Workgroup Computing is the application of a computer-based and commonly usable environment for the support of teams to fulfil their common tasks. Supported are primarily:
- Coordination of tasks (see chapter 5.2 for awareness, chapter 5.3 for concurrency control and chapter 5.4 for social conflict management),
- Communication (see chapter 5.5) and
- Collaboration (see chapter 5.6).
Workgroup Computing tries to create a virtual enhanced office or work space (see [Rosema96], [Fitz96], [Fahlen93]) where people can meet and work together to solve a problem in a group. The shared workspace is a communication medium (see [Mitche95]).
Workgroup Computing systems do not define strict rules for the cooperation of these people but leave it mostly to the people themselves to coordinate their tasks. A strict concurrency control like that used for databases would be more restrictive than necessary (see [Munson96]). This approach is flexible and straightforward for asynchronous software such as email, where people usually don’t work on the same problem at the same time: it wouldn’t make much sense to write an answer to a letter one does not have yet.
However, it gets more complicated with synchronous software, where people work on the same document at the same time. In a real office, people are usually aware of each other’s actions and focus of interest. The system should encourage awareness by providing similar information about the shared context.
Workgroup Computing systems usually provide two ways of working together (also called two ways of communication): direct communication and collaboration (indirect communication through shared artefacts and common workplaces).
Workflow Management is the planning, simulation, execution and control of business processes and the provision of the necessary tools and information. Studies of working behaviour have increasingly observed that the "coordination" of work is, itself, work (see [Dourish96]). Workflow systems offer to relieve users of the burden of coordination by managing task coordination within the system, so that the user can focus on the work activities. The use of workflow technologies is growing hand in hand with the popularity of business process re-engineering.
The emphasis in workflow management is on using computers to help manage business processes and boost productivity by eliminating overhead time spent in collecting and disseminating the information needed for performing tasks. Furthermore, workflow systems can be used to monitor the task.
Although usually used for clearly defined business processes, workflow technology can also be applied to highly individualised processes. Ad hoc workflow requires graphical workflow development tools with which the end user can easily create and modify workflows (see [Koulop]).
Components of workflow systems are:
- Workflow editor: for graphical planning of processes.
- Workflow simulator: for the simulation and verification of business processes.
- Workflow engine: for the execution of business processes.
- Workflow monitor: for the controlling and monitoring of running processes.
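How the engine and the monitor relate to a process defined in advance can be sketched in a few lines of Python. This is only an illustrative sketch and not the design of any particular workflow product; the class `WorkflowEngine`, its methods and the example process are hypothetical names chosen for this illustration.

```python
class WorkflowEngine:
    """Executes a process that was defined up front as an ordered list of steps."""

    def __init__(self, steps):
        # steps: list of (task name, responsible person) pairs,
        # as a workflow editor might produce them (assumption).
        self.steps = list(steps)
        self.position = 0
        self.log = []  # completed tasks, read by the monitor

    def current_task(self):
        """Return the (task, person) the engine is waiting for, or None when done."""
        if self.position < len(self.steps):
            return self.steps[self.position]
        return None

    def complete(self, person):
        """Mark the current task as done; only the assigned person may do so."""
        task = self.current_task()
        if task is None or task[1] != person:
            return False
        self.log.append(task)
        self.position += 1
        return True

    def status(self):
        """Monitor view: number of finished steps and the pending task."""
        return {"done": len(self.log), "total": len(self.steps),
                "pending": self.current_task()}


engine = WorkflowEngine([("write report", "author"),
                         ("review report", "reviewer"),
                         ("approve report", "manager")])
engine.complete("author")
print(engine.status())
```

The system, not the user, decides which task comes next, which is the main contrast to workgroup computing described above.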
Jonathan Grudin mentions 8 problems in [Grudin] related to groupware. Volker Wulf writes about conflict management in [Wulf97].
A key to successful groupware is flexibility. Different kinds of users with different styles of working use the software. The working style also changes over the time the software is used and across the different phases of the process. The software should run in different environments and should be usable for various tasks.
This chapter describes problems related to the creation and testing of software, to the necessity of flexible groupware which adapts to the environment, to the user and to the task.
- Difficulty of evaluating groupware : Groupware must often interface simultaneously to users with different and sometimes shifting roles, preferences, and backgrounds. Users can be tested in a laboratory on the perceptual, tactile, and cognitive aspects of human-computer interaction that are central to single-user applications, but lab situations and partial prototypes cannot reliably capture complex but important social, motivational, economic, and political dynamics.
- Complexity of software: Groupware is either specialised for a specific work task and/or a specific group of users, or it is designed as flexible software. Some of the aspects of flexible software are:
- Different communication paradigms : Groupware can support different kinds of data exchange, such as video/audio (e.g. video/audio on demand), big files and chat texts. These data sources need different kinds of transmission (the parameters are "amount of data" and "amount of data per time period"). Another important parameter is the QoS (Quality of Service): sometimes it is acceptable to lose some of the data. This is most probably acceptable for online video, but surely not for the transmission of a file of important data. Does the data source need to be sure of the successful transmission (e.g. an automatic observation point which needs to report a significant increase in the water level of a river)? Sometimes the QoS even differs between clients (the increase must be reported to the crisis team but not necessarily to the statistics team).
- Different levels of concurrency control : The concurrency control mechanism has to adapt to the different work styles and phases of work. For example, two people decide to co-write a book and define each chapter as a unit. In the brainstorming phase, both want to add notes and ideas to all of the chapters at the same time. In the second phase, each author writes his own chapters (without interference from the other author). In the third phase, they again want to edit the same chapters at the same time, but different paragraphs (the paragraph is defined as a semantic unit): one reviews the other’s chapters while the other already implements the necessary changes. Therefore, they need to change the concurrency control policy over the different phases of the project.
- Different levels of awareness of co-workers : [Lee96] describes the use of Portholes (video awareness tools) to identify, develop and maintain work and social relationships for distributed groups by fostering a sense of proximity, accessibility and community-hood. Awareness creates the context of a virtual office or place to work. A problem is the sense of loss of privacy and the feeling of surveillance and monitoring. Therefore, the software allows users to control the resolution of their image. In the example of co-writing a book (mentioned above), the two authors also need different levels of awareness of each other’s focus in the text. Too much awareness could result in information overload and distraction; too little awareness causes more concurrency and social conflicts. In the first phase, for example, the two people work highly in parallel and need to know the other’s exact position in the text. In the second phase, they just need to know the chapter the other one is currently working on. In the third phase, they need to know the paragraphs edited by the co-worker.
- Different hardware environments : Most networks are heterogeneous. Organisations often use workstations and PCs in the same network. There are even different kinds of LAN or WAN. Some networks support multicasting, others need software emulation of that function. The QoS, transmission rate and time also play an essential role. The software needs a robust protocol, graceful degradation and extensibility to be able to react to changes in the hardware environment.
- Different software environments : Not just the hardware environment but also the software environment is heterogeneous. People use UNIX, MS Windows, VMS or other operating systems. Thus, for an organisation to allow flexible user and group management, the software may need to be deployed on different platforms and should therefore be platform independent (e.g. Java). In addition, groupware often needs to be integrated in, or integrate itself into, other software packages like word processors. This way people can still use their own software, which in turn helps to achieve higher user acceptance of groupware. To be flexible, open protocols need to be established and used.
- Different group organisations : Groupware also needs to adapt to different group organisations and roles in the group. A hierarchical group for example often needs different mechanisms to resolve conflicts than a flat group. In a hierarchical group conflicts could be resolved by the decision of the group leader. However, if the people are at the same level they need to find a solution together.
- Exception handling in workgroups: Work processes can usually be described in two ways: the way things are supposed to work and the way they do work. A wide range of error handling, exception handling, and improvisation are characteristic of human activity. We have to recognise a large amount of ad hoc problem solving in human activity. This especially makes the creation of workflow tools a very difficult task because workflow tries to find a "standard process" for using it as a model.
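The per-recipient quality of service described under "Different communication paradigms" above (the water level must reach the crisis team, but not necessarily the statistics team) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function `send_with_qos`, the QoS class names "reliable" and "best_effort", and the simulated lossy channel are all hypothetical and do not correspond to a real network API.

```python
def send_with_qos(message, recipients, channel, max_retries=3):
    """Send a message to each recipient according to its QoS class.

    recipients: dict mapping recipient name to "reliable" or "best_effort".
    channel:    callable (name, message) -> bool, True if delivery was confirmed.
    Returns a dict mapping recipient name to True (confirmed) or False.
    """
    report = {}
    for name, qos in recipients.items():
        # Reliable recipients get retransmissions, best-effort ones a single try.
        attempts = 1 + max_retries if qos == "reliable" else 1
        # any() stops at the first confirmed delivery.
        report[name] = any(channel(name, message) for _ in range(attempts))
    return report


def make_lossy_channel():
    """Simulated channel (assumption): loses the first packet sent to each recipient."""
    seen = set()

    def channel(name, message):
        if name in seen:
            return True
        seen.add(name)
        return False

    return channel


report = send_with_qos("water level rising",
                       {"crisis team": "reliable", "statistics team": "best_effort"},
                       make_lossy_channel())
print(report)
```

With this simulated channel the crisis team receives the report on the second attempt, while the statistics team simply misses it.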
Categorisation of Groupware
The diversity of groupware applications is enormous, largely due to the lack of agreement as to the exact boundaries of the field. However, these applications can be categorised along a number of axes, as done here (see [Mitche95]).
- Focus on the user: The focus can be on communication between users. Information is delivered from one point to another through one-way channels. Synchronous communication with a large amount of data, like video or audio conferencing, is mostly point to point. Asynchronous communication, like email systems and discussion groups, usually uses a server for delivering data.
- Focus on the document : People are working on the same document. Concurrency control becomes important. Examples are whiteboards or collaborative editors.
- Focus on the process : In workflow systems the focus is on the process. Workflow systems are discussed in the next chapter.
- Unstructured work: In an unstructured work environment like a brainstorming session, concurrency control by the system is often unwanted. Social protocols are often used for the basic control of co-operation. Groupware only supports the team members by creating a common virtual workplace (even if the members are geographically separated) with basic communication and collaboration functions and by improving team coordination (see chapter 4.1.3).
- Highly structured work : In applications dealing with structured data, people often have clearly defined roles, like mediators and observers. The application can take over the role of concurrency control and can even control the flow of information and monitor the progress of work.
Also of influence is the structure of data and semantic objects:
- Unstructured data : A whiteboard can be based on the manipulation of pixels. Single pixels are not semantic units, so concurrency control is either based on the whole picture (one painter and one or more observers) or the application relies on social protocols to coordinate concurrent tasks.
- Structured data : If a whiteboard defines objects, like squares and circles, to create the image, concurrency control can be based on the object level.
- Hierarchical data : Books, for example, are divided into chapters and again into sub-chapters. Applications can adapt the granularity of their concurrency control to the level of concurrency that is needed.
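Adaptable lock granularity on hierarchical data can be sketched as follows. This is a minimal illustration, assuming that every unit is addressed by a path such as ("chapter 1", "paragraph 2") and that a lock on a unit also covers everything below it; the class `HierarchicalLocks` and its methods are hypothetical names, not part of any system discussed in this thesis.

```python
class HierarchicalLocks:
    """Locks on paths in a hierarchy; a lock covers the unit and all its sub-units."""

    def __init__(self):
        self._locks = {}  # path (tuple) -> owner

    @staticmethod
    def _overlaps(a, b):
        # Two paths overlap if one is a prefix of the other (or they are equal).
        shorter = min(len(a), len(b))
        return a[:shorter] == b[:shorter]

    def acquire(self, owner, path):
        """Try to lock a unit; fail if another user holds an overlapping lock."""
        path = tuple(path)
        for held, holder in self._locks.items():
            if holder != owner and self._overlaps(held, path):
                return False
        self._locks[path] = owner
        return True

    def release(self, owner, path):
        """Release a lock if it is held by the given owner."""
        if self._locks.get(tuple(path)) == owner:
            del self._locks[tuple(path)]


locks = HierarchicalLocks()
# Writing phase: coarse locks on whole chapters.
chapter_locked = locks.acquire("anna", ["chapter 1"])
blocked = locks.acquire("ben", ["chapter 1", "paragraph 2"])
# Editing phase: anna gives up the chapter, both lock single paragraphs.
locks.release("anna", ["chapter 1"])
ben_paragraph = locks.acquire("ben", ["chapter 1", "paragraph 2"])
anna_paragraph = locks.acquire("anna", ["chapter 1", "paragraph 3"])
```

Switching from chapter paths to paragraph paths changes the granularity without changing the locking mechanism itself.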
- Collaboration-awareness: Systems which are aware of the fact that several people are working in cooperation, are called collaboration-aware.
- Collaboration-transparency : Systems which do not contain semantics for collaborative tasks, are called collaboration-transparent. Single-user applications could for example simply lock documents before working on them. This leads to a simple turn-taking mechanism, which is just a limited kind of collaboration.
[Bentley94] identifies three levels of sharing:
- Presentation-level sharing : Each user looks at the same display of information from a common information space, also called WYSIWIS (What You See Is What I See).
- View-level sharing : Each user has a presentation of the same information, but the presentation may differ, also called "relaxed" WYSIWIS.
- Object-level sharing : Each user is working in the same information space, but different information is drawn from it (for example because of different access rights or preferences).
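Object-level sharing, where all users work in the same information space but see different parts of it, can be illustrated with a short Python sketch. It assumes a very simple rights model in which each object lists the groups allowed to see it; `SharedObject`, `view_for` and the example groups are hypothetical and are not taken from [Bentley94].

```python
from dataclasses import dataclass, field


@dataclass
class SharedObject:
    name: str
    readers: set = field(default_factory=set)  # groups allowed to see the object


def view_for(space, user_groups):
    """Return the names of the objects visible to a user with the given groups."""
    groups = set(user_groups)
    return [obj.name for obj in space if obj.readers & groups]


# One common information space, shared by everybody.
space = [SharedObject("draft chapter", {"authors"}),
         SharedObject("review notes", {"authors", "reviewers"}),
         SharedObject("published book", {"authors", "reviewers", "public"})]

print(view_for(space, {"reviewers"}))
```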
The location of the user can be either remote or local, but this distinction is not so important for collaboration software, especially for software based on the Internet. However, it is important for scalability and reliability. For software which is only used in the Intranet of a company or university, it can be easily estimated how many people will use the system; a system which is accessible from the Internet, has to scale very well. Furthermore, the Intranet network is usually more reliable than the Internet.
When projects become more global there is a need for new powerful tools to organise and manage the work in groups, which may be spread all over the world (for an example see [Fielding97]). A group of 5 to 10 people distributed over the whole world can be managed by using the telephone and fax. A group of 50 people in a company can be managed by regular meetings. However, a large group of people distributed in place and time needs better tools for management and information dissemination.
Think of writing a book with several guest-authors, like [Maurer96]. Of course, the group members could be called by telephone and information could be exchanged by fax, leaving it to the other side to key in the information again. Alternatively, one could use Internet-based tools like email and a document management system like Hyperwave.
Let’s look at different kinds of system architecture, because the architecture defines some characteristics of the system like reliability, response time, ability for locking and scalability. Then a deeper look has to be taken into the topics of awareness, concurrency control, social conflicts, communication and collaboration. The chapter will be concluded with a look at two examples: Lotus NSTP and Hyperwave’s Document Management System.
If an application needs to be created for the Web, some important points need to be observed:
- 24-hour availability : If people all over the world use the application, 24-hour availability is important because of the time differences.
- Scalability : The Internet makes it possible for the number of people using the service to increase by more than 100 percent in just one day. With accessibility from all over the world, millions of people may end up using the system. Therefore, the system has to grow on demand.
- Response time : Response time is another important factor in satisfying the users of a system. This point is closely related to the second one. Using synchronous collaboration, like sharing a whiteboard, the system depends on quick reaction of the client’s interfaces to changes.
Calculation example: video/audio transmission with network bandwidth limitation
In a video conference, the amount of data created per time unit stays approximately the same (the compression ratio could change over the time). So there is the same amount of data created each second and has to be transmitted in a certain time limit (real time criteria), usually in the same amount of time it is generated. This makes sure that there isn’t more data created than can be transmitted. There is no hardware-supported multicasting.
The bottleneck in this example is the bandwidth of the network (most likely if the Internet is used). Another bottleneck could be the maximum amount of data the client is able to send per second.
The QoS (quality of service) is not so important in this example. If some packets of data (frames or parts of frames) are lost, communication is still possible. Therefore, in this example the confirmation of the successful transmission of data is not necessary.
- #m: Number of group members
- bn: Average bandwidth of the network for point-to-point communication [ bit/s ]
- dt: Amount of data created per second [ bit/s ]
- tp: Average transmission time of one bit. It is neglected in this calculation, because it does not change the maximum number of group members or the amount of data which can be transmitted. This parameter must be considered in a hard real-time system, because it increases the delay to get the information.
- tu: Time unit to transmit the data
- tt: Maximum transmission time of the whole information to the group [ s ]
Figure 2 (one transmission)
This is a totally distributed system. Usually there is a central entry-point to get necessary information for joining a group, like the addresses of other group members. Then the group members just communicate to each other.
Microsoft’s NetMeeting and Netscape’s Conference can be mentioned as examples of such an architecture. When these programs start, they optionally notify a public Internet directory, which is used by other people to look up "telephone numbers". However, the rest of the communication (when they connect to somebody) is just client-to-client.
This architecture is often used if much data has to be transferred with a short delay (like video and audio data) and locking and causal multicast are not so important. The bottleneck of a server is avoided and there is no additional time for the transfer to the server.
If the group gets too big, the workload for the clients gets bigger, because they have to send the information to every other client. To overcome the problem the network could support multicasting protocols.
Characteristics of this architecture
- Fat client: The client has to deal with communication, locking and error handling because there is no server to deal with it. In addition, the client has to keep the list of other members of the group. So each client carries an overhead of program logic and group information.
- Locking : Distributed locking is more difficult than central locking.
- No central component : This makes the system very reliable. Even if some clients crash the others can still keep on working (after error-recovery). The only central component would be a service for getting the groups entry points. As soon as the client has that information, it doesn’t need the service anymore.
- Fast transmission of much data : there is no bottleneck such as a central server, and no additional time is needed for the transmission of data from client to server. Thus, this architecture is well suited for real-time applications if the group is not too big.
- Scalability : the maximum size of the group depends on the amount of data to transmit per time unit, the bandwidth of the network and the amount of data the client is actually able to transmit per time unit. The number of groups is virtually unlimited.
- Response time : the response time is very short, because the information is sent from point to point. The response time gets longer the bigger the group is.
Number of bits per second to transmit:
Every bit created has to be sent to all the other group members.
Transmission time for the information created in one second:
The amount of information created in a time unit has to be transmitted within that same time unit, if not faster; otherwise, the information source would create more data than it can transmit.
Maximum number of group members:
#m is the maximum number of group members so that the amount of data created per second can be transmitted to all the members in a second.
Thus, the maximum number of users in a group is limited by the maximum amount of data the client can transmit and the amount of data created per time unit.
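The formulas belonging to this calculation are not reproduced in the text above. One reading that is consistent with the definitions of #m, bn, dt, tu and tt is given below as a reconstruction, not as the original derivation. It assumes a time unit of tu = 1 s, that each client sends its own data to the other (#m - 1) members one after the other, and that the point-to-point bandwidth bn (written b_n, with dt written d_t) is the limiting factor.

```latex
\begin{align*}
\text{bits per second to transmit:} \quad & d_t \, (\#m - 1) \\
\text{transmission time:} \quad & t_t = \frac{d_t \, (\#m - 1)}{b_n} \le t_u = 1\,\text{s} \\
\text{maximum number of group members:} \quad & \#m \le \frac{b_n}{d_t} + 1
\end{align*}
```

Under these assumptions the maximum group size grows linearly with the ratio of bandwidth to data rate. With the purely hypothetical values bn = 1 Mbit/s and dt = 10 kbit/s, a group could have at most 101 members.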
Figure 3 (one transmission)
This architecture is partly distributed, but has central components. One client is selected to be the server. Client-to-client systems often implement this type of architecture as well, to have a central point for storing some information (e.g. for locking). This central point is used as a serialisation point.
- Distributed mode: If the transmission of video/audio data is still done client-to-client, the system behaves like a distributed (client-to-client) system (see the previous chapter). In this mode, the central client, which serves as the group server, just stores general group information and locking data. This simply makes locking easier.
- Client-server mode: If all the data transmission is done through the group server (it also serves as a multicasting point), the number of clients is limited by the amount of data created in the whole group and the amount of data the group server is able to transmit in a time unit. In this mode, the system behaves like a central or client-server system (see next chapter). Only error recovery and reliability are better.
Characteristics of this architecture
- Fat client: as in the purely distributed architecture discussed before, the software of these clients is rather complicated. However, just one client has to store general information like team members and locking information.
- Reliability : Every client is able to be a server, so if the group server crashes or is not available, another client can take its part and perform the error recovery.
- Locking : The "central" client is used to store locking information and other central data. This makes locking much easier, because the central client serves as a serialisation point. The decision making is done centrally by this group server and not distributed in the group.
- Scalability : the maximum size of the group depends on the amount of data the central client has to and is able to transmit per time unit and the bandwidth of the network. The number of groups is virtually unlimited. The server as the central point of entry just has to serve the information of group entry points.
- Response time : the response time is very short, because the information is sent from point to point. The response time gets longer the bigger the group is.
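The role of the group server as a serialisation point can be sketched in Python. The sketch assumes that every operation (for example a lock request) is sent to the group server first, which numbers it and forwards it to all clients, so that every client sees the same order. The classes `Client` and `GroupServer` and their methods are hypothetical names for this illustration; failure handling and the takeover by another client are left out.

```python
class Client:
    def __init__(self, name):
        self.name = name
        self.history = []  # operations in the order assigned by the group server

    def apply(self, seq, sender, operation):
        self.history.append((seq, sender, operation))


class GroupServer:
    """Central client that numbers every operation before forwarding it."""

    def __init__(self, clients):
        self.clients = list(clients)
        self.next_seq = 0

    def submit(self, sender, operation):
        # The sequence number defines one global order for the whole group.
        seq = self.next_seq
        self.next_seq += 1
        for client in self.clients:
            client.apply(seq, sender, operation)
        return seq


anna, ben = Client("anna"), Client("ben")
server = GroupServer([anna, ben])
server.submit("anna", "lock chapter 1")
server.submit("ben", "lock chapter 1")
```

Because both requests pass through one place, all clients agree that the first request won, and no distributed agreement protocol is needed.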
Number of bits per second to transmit:
Every bit created has to be sent to all the other group members.
Transmission time for the information created in one second:
The amount of information created in a time unit has to be transmitted within that same time unit; otherwise, the information source would create more data than it can transmit.
Maximum number of group members:
#m is the maximum number of group members so that the amount of data created per second can be transmitted to all the members in a second.
Number of bits per second to transmit:
Every group client (except for the client that plays the role of a server) has to send its information to the server.
The server has to send the information it got from the clients to all the other clients. In this example, the server sends the information to the whole group (except itself).
Transmission time for the information created in one second:
This is the transmission time of data from the clients to the server. Here the bandwidth of the network is the bottleneck.
The data from all #m clients (including the server) must be sent to (#m - 1) clients (not to the group server) from the server.
The amount of information created in a time unit has to be transmitted within that same time unit. Otherwise, the information source would create more data than it can transmit.
Maximum number of group members:
Therefore, the maximum number of group members grows only with the square root of the ratio between the maximum amount of data the group server can transmit and the amount of data created per time unit. This protocol therefore does not scale as well with the number of group members.
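The formulas for the client-server mode are not reproduced in the text above. The following is a reconstruction from the surrounding description, not the original derivation. It assumes a time unit of tu = 1 s, that the group server forwards the data of all #m members to the other (#m - 1) clients, and that the outgoing transmission of the group server, limited by bn (written b_n, with dt written d_t), is the bottleneck.

```latex
\begin{align*}
\text{client to server, per client:} \quad & d_t, \qquad t = \frac{d_t}{b_n} \\
\text{server to clients:} \quad & \#m \, (\#m - 1) \, d_t \\
\text{transmission time:} \quad & t_t = \frac{\#m \, (\#m - 1) \, d_t}{b_n} \le t_u = 1\,\text{s} \\
\text{maximum number of group members:} \quad & \#m \le \frac{1}{2} + \sqrt{\frac{1}{4} + \frac{b_n}{d_t}}
\end{align*}
```

For large groups this bound is approximately the square root of bn/dt, which is where the square-root behaviour comes from.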
Figure 4 (one transaction)
In this architecture, different software is used for the client and server. So every part can be specialised in its task.
Characteristics of this architecture
- Thin client: the client just has to implement the communication protocol to the server. Most information is kept with the server.
- Easy locking : requests are automatically serialised; locking is easy because of the data being kept centrally and central decision making.
- Atomicity of message delivery could be easily implemented.
- One central process : If the server fails, the clients cannot continue with their work unless there is a backup server. This server is also a bottleneck of the system.
- Scalability : the central server is the bottleneck of the system. The maximum size of the group depends on the amount of data the central server has to and is able to transmit per time unit and the bandwidth of the network. The number of groups is limited, if they all use the same server.
- Response time : the response time is longer than in the distributed system, because the information has to be sent to the server and the server forwards it to the clients.
Number of bits per second to transmit:
Every group client has to send its information to the server, which is not a group member.
The server has to send the information it got from the clients to all the other clients. In this example, the server sends the information to the whole group, even to the client from which it got the information.
Transmission time for the information created in one second:
This is the transmission time of data from the clients to the server. Here the bandwidth of the network is the bottleneck.
The data from all #m clients must be sent from the server to all #m clients. In this example, the server sends the information to the whole group, even to the client from which it got the information.
The amount of information created in a time unit has to be transmitted within that same time unit. Otherwise, the information source would create more data than it can transmit.
Maximum number of group members:
Therefore, the maximum number of group members grows only with the square root of the ratio between the maximum amount of data the server can transmit and the amount of data created per time unit. This protocol therefore does not scale as well with the number of group members.
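The three calculations can be compared numerically with a short Python sketch. It rests on assumptions that are not spelled out in the text: the time unit is one second, the node that sends the most data is the only bottleneck, and the number of stream copies this node sends per second is (#m - 1) for the distributed architecture, #m (#m - 1) for a client acting as group server, and #m squared for a central server that echoes data back to the sender. The function names and the example values are hypothetical.

```python
def max_group_size(bandwidth, data_rate, copies):
    """Largest group size m for which copies(m) * data_rate fits into the bandwidth.

    bandwidth: bits per second the bottleneck node can transmit (bn).
    data_rate: bits per second created by each member (dt).
    copies:    function m -> number of stream copies the bottleneck node sends.
    """
    m = 0
    while copies(m + 1) * data_rate <= bandwidth:
        m += 1
    return m


def distributed(m):
    return m - 1          # each client sends its stream to all other members


def group_server(m):
    return m * (m - 1)    # streams of all m members go to the m - 1 other clients


def central_server(m):
    return m * m          # streams of all m members go to all m clients


bn = 1_000_000  # hypothetical bandwidth: 1 Mbit/s
dt = 10_000     # hypothetical data rate per member: 10 kbit/s

sizes = {name: max_group_size(bn, dt, f)
         for name, f in [("distributed", distributed),
                         ("group server", group_server),
                         ("central server", central_server)]}
print(sizes)
```

For these example values the distributed architecture supports about ten times as many members as the two server-based variants, which illustrates the linear versus square-root growth.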
The necessity of different communication paradigms for groupware was discussed in chapter 4.2. The demands can change in the different stages of work. In addition, different kinds of data need different sending paradigms [Gall].
This paradigm ([Mathur95]) is characterised by one or more data sources or publishers sending data to multiple recipients or subscribers by using distributors. A publisher multicasts data to a set of intermediate nodes, referred to as distributors. The distributors then route the data to other distributors or local subscribers. The direction of communication is just one way (from publisher to subscriber) and anonymous. The publishers are aware of their recipients, but the subscribers are unaware of each other and just aware of the publisher that they are receiving data from.
This paradigm supports a weak form of reliability for the subscribers. If one publisher crashes, the subscriber just searches for the next publisher and subscribes again. However, that way the subscriber could lose some information.
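The routing from publisher through distributors to subscribers can be sketched in Python. This is a simplified in-process model and not the protocol of [Mathur95]: the classes `Publisher`, `Distributor` and `Subscriber` are hypothetical, the distributor topology is assumed to be free of cycles, and crash handling and re-subscription are left out.

```python
class Subscriber:
    def __init__(self, name):
        self.name = name
        self.received = []  # (publisher name, message) pairs

    def deliver(self, publisher, message):
        self.received.append((publisher, message))


class Distributor:
    """Intermediate node: serves local subscribers and forwards to other distributors."""

    def __init__(self):
        self.subscribers = []
        self.next_hops = []  # further distributors (assumed to form no cycle)

    def route(self, publisher, message):
        for subscriber in self.subscribers:
            subscriber.deliver(publisher, message)
        for distributor in self.next_hops:
            distributor.route(publisher, message)


class Publisher:
    def __init__(self, name, distributors):
        self.name = name
        self.distributors = list(distributors)

    def publish(self, message):
        # One-way communication: nothing flows back from the subscribers.
        for distributor in self.distributors:
            distributor.route(self.name, message)


near, far = Distributor(), Distributor()
near.next_hops.append(far)
anna, ben = Subscriber("anna"), Subscriber("ben")
near.subscribers.append(anna)
far.subscribers.append(ben)

sensor = Publisher("sensor", [near])
sensor.publish("level +20 cm")
```

The subscribers only learn the name of the publisher; they hold no reference to each other, which mirrors the anonymity described above.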
One of the specific design goals of multiprocessor operating systems has been to give each user the look and feel of being the only one on the system. One prints to queues to be able to print documents even if the printer is busy at that moment. Databases try to serialise concurrent operations of users to ensure that they have the same effect as operations which are performed one after the other. Information about other people using the system must be explicitly requested.
However, in groupware the awareness of other people is crucial. Nowadays the members of a human team are often spread among several departments of an organisation or even live in different countries. Groupware interferes with and changes the subtle and complex social dynamics that are common to teams. Often unconsciously, actions of group members are guided by social conventions and by the awareness of the personalities and priorities of other people, knowledge not available to the computer [Grudin]. If collaborating teams use distributed applications for their work, coordination tasks are also carried out through (and should be supported by) the software system.
Workspace awareness creates a common working environment for the team members [Greenb96]. So the effort needed to coordinate tasks and resources can be reduced, people move easily between individual and shared activities, and a context is provided to interpret other people’s activities [Gutwin96]. Group awareness can be defined as "an understanding of the activities of others, which provides a context of your own activity" [Dourish92].
[Schlichter97] mentions four types of awareness:
- Informal awareness of a work community is basic knowledge about who is around in general (but perhaps out of sight) or who is "physically" in the same room.
- Group-structural awareness involves knowledge about such things as people’s roles and responsibilities, their positions on an issue, their status, and group processes.
- Social awareness is the information that a person maintains about others in a social or conversational context: things like whether another person is paying attention, their emotional state, or their level of interest.
- Workspace awareness is the up-to-the-minute knowledge a person requires about another group member’s interaction with a shared workspace if they are to collaborate effectively.
There are some reasons why the groupware system should filter the awareness information before it is brought to the user’s attention. In different phases of collaborative work it is necessary to switch between individual and shared activities. In these phases the level of awareness needed for efficient teamwork changes.
It is very important for most people to have a certain level of privacy. Let’s take the example of a telephone as a tool for enabling groupwork. If the call is not accepted the calling person does not know whether the person being called is not in office or is just too busy to pick up the receiver. This information would be helpful for the calling person to decide whether to call again in a few minutes. If the person being called is busy, it is very disturbing to be called every five minutes. If the person being called is just not in the office, that is no problem. For the person being called this lack of information means more privacy [Lee96]. These different interests of people with different roles can be called social conflict (see chapter 5.4).
In other phases of their work, when the people need more active collaboration to finish their tasks, team members could decide to provide more information about themselves to facilitate collaborative activities.
In a real office, people permanently get information about what is going on at that moment. This can be disturbing. In distributed collaborative software, filters could reduce the amount of awareness information to avoid information overflow. In some phases, however, the level of awareness needs to be higher.
[Gutwin97] mentions some general issues that complicate the search for general and transferable awareness requirements.
- Domain specificity: Much of what a person needs to know about others depends heavily on the application domain and the person’s own role in that domain (e.g. information distributor or observer).
- Information importance: Some awareness information is crucial for the completion of a shared task. Other information is beneficial but not critical. Team members need to be aware of critical information, but they should be able to decide whether to be informed of additional information. Usually it is easy for system designers to find out which information is critical (e.g. a countdown for a system shutdown), but it is more difficult to find out by what additional information the collaboration is supported.
- Changing requirements: As mentioned in chapter 5.2.2 the optimal level of awareness changes over time, because people shift their focus between individual and shared tasks. Adaptable filters could provide flexible amounts of information.
- Effects of expertise: As people become more familiar with a domain, a task, the software, and a group of collaborators, they are able to infer more and more about other people’s activities from smaller and more subtle perceptual signals.
- Evaluation: Awareness is not a quality that can be easily measured, and showing the benefits of awareness support in groupware is difficult at the best of times. Evaluation is complicated by the lack of a clear cognitive theory of what awareness is and how it works. Studies of awareness support in groupware cannot rely only on time and errors. ([Greenb97], [Gutwin96])
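The adaptable filters mentioned under "Changing requirements" and the distinction between critical and merely beneficial information can be illustrated with a small sketch. It is not taken from any of the cited systems; the names `AwarenessEvent` and `AwarenessFilter`, the `shared_phase` flag and the event kinds are assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AwarenessEvent:
    """One piece of awareness information (hypothetical structure)."""
    kind: str        # e.g. "cursor-move" or "shutdown-countdown"
    critical: bool   # critical events must always reach the user


@dataclass
class AwarenessFilter:
    """Per-user filter whose setting can change between work phases."""
    shared_phase: bool = False                     # True while collaborating closely
    subscribed: set = field(default_factory=set)   # extra kinds the user opted into

    def passes(self, event: AwarenessEvent) -> bool:
        if event.critical:
            return True                        # never filter critical information
        if self.shared_phase:
            return True                        # high awareness during shared work
        return event.kind in self.subscribed   # individual work: opt-in only
```

During individual work a user would see only critical events and the kinds he subscribed to; switching `shared_phase` to `True` raises the level of awareness without changing the rest of the application.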
When two or more users work jointly together sharing one object, there is a need for the synchronisation of their actions to ensure the consistency of the object. Conflicts which can be resolved by the system itself will be called "software conflicts" in contrast to "social conflicts". Software conflicts are conflicts through the multiple access to an object or semantic unit at the same time, network bandwidth and transmission problems or other hard- and software problems.
Concurrency control has been used for a long time in the area of database systems. First, the drawbacks of restrictive database concurrency control when used in the field of collaboration and groupware are discussed. Secondly, different categories of collaboration control are described. Then the requirements for dynamism in groupware concurrency control are outlined.
[Munson96] addresses four drawbacks of traditional database concurrency control when used in collaboration systems.
- Traditional database concurrency control is generally too restrictive for collaboration systems. Database transactions consist of simple read/write operations. The semantics of the shared objects of collaboration are usually more complex, so database-like concurrency control is often more conservative than necessary. If, for example, people work concurrently on writing a book, then blocking the whole bibliography chapter for an insertion is unnecessary. Concurrent entries could be allowed there (risking redundant entries that can be removed in a consolidation phase), whereas locking could still be necessary for the other chapters of the book.
- Traditional database systems do not allow concurrent transactions to mutually depend on each other. Users of a groupware system may be expected to influence each other. Let’s take again the example of jointly writing a book. If one author needs to insert a new bibliography citation and observes another author doing the same, this author can see if the other one tries to insert the same citation before he has finished.
- Collaborative systems users may wish to temporarily allow conflicting actions and delay their resolution until some later time. Conventional database systems do not allow the database to remain in an inconsistent state for indefinite periods. In a brainstorming session, this could be desired; or joint authors of a book may independently add bibliography citations and leave removal of duplicates to a later stage of their work.
- When a conflict is identified, a conventional database system will throw away all work that led to the conflict and return the database to a prior consistent state. For a user about to commit a large number of changes to a document, this would be unpleasant and unnecessary. Maybe only the changes to some paragraphs need to be discarded.
- Different kinds of data require different levels of consistency: Brainstorming sessions for example need highly interactive work. If there is a document, which is passed from one public servant to the next one to be completed, it needs to be assured that people who have finished their work are not allowed to change anything afterwards.
- Different modes of collaboration require different kinds of awareness: Too much awareness distracts people from their work. Not knowing enough could lead to misunderstanding and conflicts. The level of awareness often depends on the level of concurrency.
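The bibliography example used above (concurrent insertions are allowed, duplicates are tolerated for a while and removed in a later consolidation phase) can be sketched as follows. This is only an illustration of delayed conflict resolution, not the mechanism of [Munson96]; the class `Bibliography` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bibliography:
    """Shared bibliography that tolerates temporary duplicates."""
    entries: list = field(default_factory=list)   # (author, citation_key) pairs

    def insert(self, author: str, key: str) -> None:
        # No lock is taken: concurrent insertions never block each other.
        self.entries.append((author, key))

    def duplicates(self) -> set:
        """Citation keys that were inserted more than once."""
        seen, dup = set(), set()
        for _, key in self.entries:
            if key in seen:
                dup.add(key)
            seen.add(key)
        return dup

    def consolidate(self) -> None:
        """Later consolidation phase: keep the first entry for each key."""
        seen, kept = set(), []
        for author, key in self.entries:
            if key not in seen:
                seen.add(key)
                kept.append((author, key))
        self.entries = kept
```

The inconsistent state (duplicate keys) is allowed to exist until the authors decide to call `consolidate()`, and no work other than the redundant entries is thrown away.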
- Social Conflict Management
- System-Supported Social Management
- One-Sided Controlling
- Activation-Related Transparency
- Annotation Support
- Negotiation Support
The last chapter was about system-controlled concurrency control. However, there is another type of conflict potential besides "software conflicts": "social conflicts" (see [Wulf97] for more details).
Groupware affects and is affected by the social structure of the group. Let’s take group calendar software for meeting scheduling. The problem is to find a free time slot of a group for a meeting. The usual procedure of the group for example is to ask the group manager to organise a meeting and he asks the group members about possible time slots. The group manager doesn’t just look at their calendars, he also asks everyone about preferred times and finds the optimum time this way.
If the software allows everybody to enter new meetings in other calendars, this would change the group’s procedure. Thus, the group has to agree on certain procedures and stick to them. However, it would be helpful if different ways of controlling are supported by the system.
Another thing is the necessity to exchange informal information. If there are two possible time slots for a meeting people might prefer one because of some reasons (less fragmentation of one’s own time schedule for example). It could be helpful to the group if the system supports (in-)formal negotiation of different solutions.
Let’s look at groupwork as group members activating functions which affect other group members. A function could be the changing of a text (which could be used by other members at the same time), the calling of a meeting with calendar software, or the changing of a record which could be used by others as well. To generalise, let’s say that one activator tries to execute a function which affects one or more group members.
This could cause conflicts of interest. Let’s take NYNEX Portholes ([Lee96]) as an example. Every computer has a video camera to observe its user. For the other group member this additional information could be useful to see if this person is too busy to be interrupted or if this person is not there at all. However, the observed person experiences the loss of privacy and surveillance.
Another conflict could be that the group leader calls for a meeting at a time when one or more group members are too busy.
The system should support a way for communication and information flow to resolve these social conflicts. Information is necessary to better understand the position of the activator and the affected person. The activator for example could inform the group members that a meeting is necessary because the goals of the project changed. The affected person could inform the activator about problems concerning the execution.
If the system supports a two-way information flow between activator and affected person, they could negotiate a solution through communication.
The information flow between the activator and the affected person can be formal (choosing one item of information out of n) or informal (chatting with the other person). Formal information is limited but easier for the system to analyse. Informal information can explain the problem more exactly but takes more time to create and evaluate.
If information flow is necessary to resolve a broad range of problems, a mixture of formal and informal information could be the most efficient way.
Below is a description of six different ways of system-supported (not system-controlled!) social controlling. For more details see [Wulf97].
One person checks if the execution of a function violates the group’s rules and executes it if possible.
A typical function of group calendar software could be the insertion of a meeting into the calendar of another group member. The person who inserts the meeting is the activator of the function. In group calendar software this means that everybody can add new meetings to other calendars as long as there is no other meeting at the same time (a necessary criterion) and as long as the person has the right to activate the function.
The affected person is not notified of the activation of the function. There is neither communication nor information flow. The system does not support the resolution of a social conflict.
One person (the activator) activates a function. The affected person is not notified of the activation (before or after the execution of the function itself), but the affected person can pre-define an automated reaction of the system to the activation.
As long as a group member is not sure, for example, whether he will be there next Tuesday, he might block entries to his electronic appointment book for that day.
Again, the system supports neither a communication channel nor the information flow between activator and affected person. However, the person is able to pre-define certain preferences.
One person checks if the execution of a function violates the group’s rules and executes it if possible. Again, this person is called the activator. The affected person is notified of the activation (before or after the execution of the function itself), but the affected person cannot control or prevent the execution of the function.
The system does not support a communication channel between activator and affected person, but there is a one-way information flow. This flow of information could be used to start further non-system-supported social conflict management (like making a telephone call to the activator to complain about the time of the meeting).
One person (the activator) activates a function. The affected person is notified by the system and decides if the execution of the function should be cancelled or performed. There is a timeout for the reaction to make sure that the conflict is resolved in a certain period.
Here the system supports the one-way information flow from activator to the affected person, but there is still no communication channel between these two. The activator just gets feedback which says if the function actually executed or not. If the affected person aborts the execution, the activator does not know the reason.
Again, this system-feedback can start further non-system-supported social conflict management.
One person (the activator) activates a function. The affected person is notified by the system and may send back (in-)formal information concerning the activation of the function.
The system supports the one-way information flow from the affected person to the activator, but does not allow intervention.
This is the most advanced type of system support for dealing with social problems. The activator activates the function and the system automatically opens a two-way communication channel to the affected user. So the user is informed about activation of the function and decides if he allows the execution of the function. If he wants to change some parameters of the function, he sends (in-)formal feedback to the activator and asks for these changes.
So the activator and the affected person can negotiate about the execution and about parameters of the function by having a system-supported discussion. The system supports the information flow in both directions and allows the intervention of the affected person.
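The six ways of system-supported social controlling described above differ in three respects: whether the affected person is notified, whether the affected person can intervene, and whether information can flow back to the activator. The following sketch tabulates these differences. It is an interpretation of the descriptions in this chapter, not code from [Wulf97]; the names `PREDEFINED_REACTION` and `VETO_WITH_TIMEOUT` are invented here for the second and fourth way, and the timeout of the fourth way is not modelled.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Support:
    notifies_affected: bool        # is the affected person told about the activation?
    affected_can_intervene: bool   # can the affected person block the execution?
    feedback_to_activator: bool    # can (in-)formal information flow back?


class Mode(Enum):
    ONE_SIDED = Support(False, False, False)
    PREDEFINED_REACTION = Support(False, True, False)   # hypothetical name
    TRANSPARENCY = Support(True, False, False)
    VETO_WITH_TIMEOUT = Support(True, True, False)      # hypothetical name
    ANNOTATION = Support(True, False, True)
    NEGOTIATION = Support(True, True, True)


def execute(mode: Mode, rules_ok: bool, affected_agrees: bool = True) -> bool:
    """Return True if the function is executed under the given mode."""
    if not rules_ok:
        return False                   # the group's rules are always checked first
    if mode.value.affected_can_intervene:
        return affected_agrees         # pre-defined reaction, veto or negotiation
    return True                        # the affected person has no say
```

For example, `execute(Mode.ONE_SIDED, True, affected_agrees=False)` still executes the function, whereas the same call with `Mode.NEGOTIATION` does not.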
Communication is used in this thesis as a one-way information flow from a source to one or more sinks (Communication was explained in chapter 5.4 as a two-way information flow, but this chapter takes a more general look at the communication paradigm). Two-way communication is done by using two different channels where each party is source and sink at the same time.
One channel does not influence the other one. So there is no need for concurrency control but maybe there is need for social problem management. The sink is aware of the source, but not necessarily the other way round.
For general groupware (or tools for creating groupware), the communication protocols need to be flexible.
- Amount of data: People can communicate by typing text (e.g. chatting, email), which produces between a few bytes and a few Kbytes per minute. Audio or video channels need several Mbytes per minute.
- Quality of transmission: The loss or corruption of information of a video channel possibly causes the loss of (part of) a frame. The more information is lost, the jerkier the movements become. In an audio transmission, the loss of information could lead to misunderstandings. A corrupted text file could even be useless.
- Timing and synchronisation requirements: Synchronous groupware needs to transmit the information within a certain time or else it must be considered lost (continuous media stream [Simon94]). Asynchronous groupware has more time for the transmission. Sometimes different channels (like audio and video channels) need to be synchronised.
- Group size: If multicasting is not supported by the hardware of the network, the same information needs to be transmitted separately to each of the group members. Therefore, the protocol needs to scale to the group size.
To provide flexible service, Corona [Hall96], for example, supports two different communication paradigms:
- Publish-subscribe: This paradigm scales to the group size but the source is not aware of the data sinks (see 5.1.4).
- Peer group: This paradigm provides reliable data delivery and dynamic awareness notifications but does not scale well.
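The difference between the two paradigms can be made concrete with a minimal in-process sketch. It does not reproduce Corona's interface from [Hall96]; the classes `PublishSubscribe` and `PeerGroup` and all method names are assumptions, and network transport and reliability are left out entirely.

```python
class PublishSubscribe:
    """The source hands data to a distributor and never learns who the sinks are."""

    def __init__(self):
        self._sinks = []

    def subscribe(self, callback):
        self._sinks.append(callback)

    def publish(self, data):
        for deliver in self._sinks:
            deliver(data)


class PeerGroup:
    """Every member knows the membership and is told when it changes."""

    def __init__(self):
        self.members = {}   # member name -> inbox (list of messages)

    def join(self, name):
        for inbox in self.members.values():
            inbox.append(("joined", name))   # awareness notification
        self.members[name] = []

    def send(self, sender, data):
        for name, inbox in self.members.items():
            if name != sender:
                inbox.append((sender, data))
```

In the first class the publisher only sees `publish()`, which is why it is not aware of the sinks. In the second class every join is announced to all existing members, which gives awareness but makes the cost of membership changes grow with the group size.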
Let’s take a collaborative text editor with multiple cursors and multi-user scroll bars, as used in [Gutwin95]. The multiple cursors are a means for supporting fine-grained awareness of location and activity of other group members in the text. The scroll bars show at which position in the whole text the other members are working. It gives the editor an idea how (semantically) near the other editors are working.
There are different approaches to coordinating the activities. The rules can be defined by the group members or the group leader without program support to enforce them. The members themselves need to follow the rules. This approach could be used if the software is not groupware-aware. If the rules change, all the group members need to be informed to ensure that all the editors apply the same rules.
Another method is to implement the rules in the groupware application. This makes the software more inflexible because rules can change with the different phases of the object’s creation (see also chapter 4.2.1).
A third way could be to define some basic rules and to use the communication tools to solve problems when they arise. That way, social conflicts could be solved or prevented as well. This method is the most flexible and adaptable.
In a collaborative editor, a social conflict could be for example, that a co-editor changes the structure of a sentence of another editor. The other editor changes the sentence back to the original structure, because he prefers it that way.
Traditional software usually prevents this conflict from happening. The author is the owner of that part of the text and the other people just have limited access to it until he unlocks it. Groupware should provide services to uncover possible conflicts.
A co-editor could annotate this sentence and suggest the changing of the structure. This could start a synchronous or asynchronous communication between these two editors (and the whole group) to come to a solution.
The (informal) exchange of information in a group creates a common knowledge base within the group. [Acker96] describes the problem of getting help and assistance especially in distributed communities. It is suggested to use the collective memory of the group (group members, repositories...) before consulting bigger (e.g. international) communities.
For a computer user with a software problem it is more efficient to ask the IT expert of one’s own group before posting a question to the Internet or asking the software producer for support.
Furthermore, the communication service could be used to strengthen the sense of unity in the group and to exchange personal information. That way the feeling of being isolated could be reduced.
While communication is defined as information-centric, collaboration is defined as object-centric in this thesis. A group of people are working together to create or/and change an object. To enable collaboration, a common workplace with shared artefacts is necessary (but the workplace does not necessarily need to be centrally administrated). Therefore, collaboration is also called "indirect communication" or the "shared manipulation of common artefacts".
A typical example of collaboration is the joint creation of a text (e.g. a book) like [Maurer96]. There is one main author or the group leader and there are one or more co-authors. The object they are working on is the text or book. Today’s applications can be separated into two major categories based on:
- Object structure and concurrency control
- Awareness and social conventions (self-restricting)
Hyperwave uses this approach with its document management system (of course Hyperwave is more than just a document management system based on the Internet). Editors can use check-out and check-in functions to ensure that there is no concurrent editing of the text. Other editors can see who has checked out the document. However, they cannot see if he is editing the text at the moment or which part he is working on. This is traditional locking commonly used in databases and file systems. To enable concurrent work, the document can be split into sub-documents for semantic units like chapters or more fine-grained units. The user can define the granularity by hand.
For discussing parts of the document, other editors can use annotations. The awareness is restricted to knowing who is locking which part of the document (which unit or object) at that moment.
This approach uses simple concurrency control to ensure (or rather enforce) the integrity of the document. The more highly structured the document is, the more concurrent the work can be.
The disadvantage is that concurrency is restricted, a fact not desirable in e.g. brainstorming sessions. The advantage is that users are used to this approach and that they can use their own text editor. No special text editor is necessary.
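The check-out and check-in scheme with sub-documents can be sketched in a few lines. This is a generic illustration of locking per unit, not Hyperwave's actual interface; `Repository`, `AlreadyCheckedOut` and the method names are hypothetical.

```python
class AlreadyCheckedOut(Exception):
    """Raised when a unit is requested that another editor holds."""


class Repository:
    """Document split into units; each unit can be checked out by one editor."""

    def __init__(self, units):
        self.text = dict(units)   # unit name -> content
        self.holder = {}          # unit name -> editor holding the lock

    def check_out(self, unit, editor):
        if unit in self.holder:
            raise AlreadyCheckedOut(f"{unit} is held by {self.holder[unit]}")
        self.holder[unit] = editor
        return self.text[unit]

    def check_in(self, unit, editor, new_text):
        if self.holder.get(unit) != editor:
            raise PermissionError(f"{editor} has not checked out {unit}")
        self.text[unit] = new_text
        del self.holder[unit]

    def who_has(self, unit):
        """The only awareness information: who currently locks the unit."""
        return self.holder.get(unit)
```

Two editors can work at the same time only if they check out different units, which is why a finer granularity of units allows more concurrent work.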
In [Gutwin95], [Gutwin96b] and [Greenb95] highly concurrent text editors with different widgets for awareness support are described. All editors can work on the document at the same time. The editors can see the position of the other users in the text. That way the users can avoid conflicts, like two people editing the same sentence at the same time.
In this approach to collaborative text editing, awareness of the other users’ actions is necessary for coordinating the tasks and for creating the sense of working in a common (physical) workplace. No concurrency control is enforced by software. Therefore, the users must restrict themselves (and observe the social conventions of the group).
The advantage of this software is that concurrency is much higher, because there is no restriction. However, changing the parts of other editors may cause social (and concurrency) conflicts. Therefore, communication facilities would be helpful. A disadvantage is that users are restricted to one text editor, which makes the introduction of this new software more painful for the user.
NSTP stands for "Notification Service Transfer Protocol" for synchronous groupware. NSTP is an infrastructure for building synchronous groupware like chat systems, shared whiteboards, group decision support systems and shared worlds. The key feature of synchronous groupware is WYSIWIS (What You See Is What I See), meaning that the user interfaces of the participants must be kept consistent to promote the impression of working or playing together.
NSTP provides a more general solution for this problem. It provides an open service for supporting the construction of such groupware applications.
- Client-server-based: The shared state is stored at the notification server, where it can be changed by clients. Each change of this shared state causes the server to deliver notifications to clients, who are interested in the change. That way WYSIWIS can be implemented.
- General: The server supports centralised state with decentralised semantics. That is, the server does not understand what the shared state means for the clients; it simply accepts updates to the state and notifies interested clients (in contrast to the Mushroom Project, see [Kindb]).
NSTP uses a model of Things and Places. The shared state is divided into Things, which are located in Places.
Things are composed of two parts: a mutable value and a small number of immutable attributes that determine how Things are treated by the server. The value of a Thing is not going to be interpreted by the server. That’s why semantics is decentralised. The attributes are set at creation time.
Each Thing has an associated lock that allows a client to ensure that he has exclusive access to that Thing over a series of operations.
A Place is a container of Things. A client can enter a Place and leave it again. Things and Places are not persistent; they are lost at the server if it crashes or terminates. Places cannot contain other places, which does not allow a hierarchical construction.
If a user enters a place, a user-Thing is created automatically. That way the other users in that place are notified of the creation of a Thing and the entering of a person.
A Place serves four distinct purposes:
- It is a naming scope for Things: Each Thing has a unique name within its Place.
- Granularity: It defines the granularity at which a client can control the server’s delivery of notifications. A client receives notifications of state changes only if it is in the Place.
- Consistency: It also defines the granularity at which the protocol guarantees consistency, atomicity and ordering.
- Access control: Places are part of the access control mechanism for Things. Some Things are visible (or even writable) for clients outside the Place, while others require that clients be inside the Place to see or modify them.
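The model of Things and Places described above can be summarised in a small in-memory sketch. It does not implement the NSTP protocol or its message formats; the classes, the method names and the `user:` naming prefix are assumptions, and access control through the Façade is left out.

```python
class Thing:
    def __init__(self, name, value, attributes):
        self.name = name
        self.value = value                   # opaque to the server
        self.attributes = dict(attributes)   # fixed at creation time
        self.locked_by = None


class Place:
    def __init__(self, name):
        self.name = name
        self.things = {}    # naming scope: names are unique per Place
        self.clients = {}   # client name -> notification callback

    def _notify(self, event, thing_name):
        # Only clients that are inside the Place receive notifications.
        for callback in self.clients.values():
            callback(event, thing_name)

    def enter(self, client, callback):
        # The user Thing is created first, so clients already present are told.
        self.create(f"user:{client}", None, {"kind": "user"})
        self.clients[client] = callback

    def leave(self, client):
        del self.clients[client]
        self.delete(f"user:{client}")

    def create(self, name, value, attributes):
        if name in self.things:
            raise KeyError(f"{name} already exists in {self.name}")
        self.things[name] = Thing(name, value, attributes)
        self._notify("created", name)

    def delete(self, name):
        del self.things[name]
        self._notify("deleted", name)

    def lock(self, name, client):
        thing = self.things[name]
        if thing.locked_by not in (None, client):
            return False
        thing.locked_by = client
        return True

    def set_value(self, name, client, value):
        thing = self.things[name]
        if thing.locked_by not in (None, client):
            raise PermissionError(f"{name} is locked by {thing.locked_by}")
        thing.value = value                  # stored without interpretation
        self._notify("changed", name)
```

The server side never inspects `value`, which corresponds to the decentralised semantics: only the clients know what a Thing means.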
The Façade is a window to the Place. Things, which are visible through the Façade, can be used to get general information about the Place.
Information to get:
- Place name
- Place type
- URL for a Java class that can run in a client and interpret the Place: this would make the Place dynamically expandable. Programs are maintained at the server side and downloaded to the client on demand.
- Parent place: that would enable a sort of hierarchical structure of Places.
The user is notified when Things are created, changed or deleted. So the user is informed about what is going on in that place at that moment.
A new user-Thing is created by the system whenever a user enters the place and is deleted again, when the user leaves. That way, all the users get a current user-list of the place.
Awareness is localised. Only users who are in that place get notifications about changes. A user can define the degree of awareness himself by filtering out some types of messages.
Things can be used for indirect communication (see Chapter 18.104.22.168). NSTP also provides basic services for direct communication with other users in that place. However, efficient communication facilities are not provided yet.
Chat systems can be easily created with the help of NSTP. Clients just have to enter the room and handle notifications for Thing creation and value change.
- New user entry: The server automatically creates a new user Thing. That way the client is notified.
- Posting a message: The message can be stored in one’s own user Thing. So each time a client posts a message, the user Thing value changes and again the other clients are notified.
- Leaving the chat place: The user Thing will be automatically destroyed by the server and the clients are notified.
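The three steps above can be sketched as a chat client that reacts only to notifications about user Things. The `ChatPlace` class is a stand-in for a notification server and not the real NSTP interface; all names are hypothetical.

```python
class ChatPlace:
    """Stand-in for a notification server hosting one chat Place."""

    def __init__(self):
        self.user_things = {}   # user name -> value of the user Thing
        self.listeners = {}     # user name -> notification callback

    def _notify(self, event, user, value=None):
        for callback in self.listeners.values():
            callback(event, user, value)

    def enter(self, user, callback):
        self.user_things[user] = ""        # server creates the user Thing
        self._notify("created", user)      # clients already present are told
        self.listeners[user] = callback

    def post(self, user, message):
        self.user_things[user] = message   # posting = changing one's own Thing
        self._notify("changed", user, message)

    def leave(self, user):
        del self.listeners[user]
        del self.user_things[user]         # server destroys the user Thing
        self._notify("deleted", user)


class ChatClient:
    def __init__(self, name):
        self.name = name
        self.transcript = []

    def on_notification(self, event, user, value):
        if event == "created":
            self.transcript.append(f"* {user} entered")
        elif event == "changed":
            self.transcript.append(f"<{user}> {value}")
        elif event == "deleted":
            self.transcript.append(f"* {user} left")
```

A client is wired up with `place.enter(name, client.on_notification)`. Note that a user Thing holds only the most recent message, so a latecomer cannot recover earlier messages in this design.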
Whiteboards need more sophisticated communication. In particular, there is the problem that latecomers should get the information about Things which have already been created, so that they see the same whiteboard screen as all the other users.
- One Thing per object: Every object (circle, box, picture…) is represented by a Thing. This Thing stores information about location, colour, size and type of the object.
- Latecomer support: When a new user enters the place, he just fetches all the Things which exist in that Place and redraws his screen accordingly.
- Participant awareness: The server automatically creates a new user Thing. That way the client is notified.
- Concurrency control: If one user selects one object for changing its size, colour, … he has to lock the object first. So exclusive access can be guaranteed.
- Defining the priority of display: an additional z-ordering Thing could be used to maintain a list of Things and their order of display.
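A whiteboard built along these lines could keep its state as shown in the following sketch. Notifications are omitted to keep the example short, and the class `Whiteboard` with its method names is an assumption, not part of NSTP.

```python
class Whiteboard:
    """One Place holding one Thing per drawn object (hypothetical layout)."""

    def __init__(self):
        self.shapes = {}    # Thing name -> dict with type, colour, size, ...
        self.locks = {}     # Thing name -> user holding the lock
        self.z_order = []   # additional Thing: names from back to front

    def add(self, name, **properties):
        self.shapes[name] = dict(properties)
        self.z_order.append(name)

    def lock(self, name, user):
        # Succeeds if the Thing is unlocked or already locked by this user.
        if self.locks.get(name, user) != user:
            return False
        self.locks[name] = user
        return True

    def unlock(self, name, user):
        if self.locks.get(name) == user:
            del self.locks[name]

    def change(self, name, user, **properties):
        if self.locks.get(name) != user:
            raise PermissionError(f"{user} must lock {name} first")
        self.shapes[name].update(properties)

    def snapshot(self):
        """What a latecomer fetches: all Things in display order."""
        return [(name, dict(self.shapes[name])) for name in self.z_order]
```

A latecomer calls `snapshot()` once after entering and redraws the screen from the returned list; from then on, ordinary change notifications would keep the display current.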
Two approaches to the problem of common text editing or creation were introduced in chapter 5.6.1. Because NSTP is a general protocol, both approaches can be theoretically implemented (object persistence is missing).
- Concurrent approach: Each unit of the document is a separate object. If a user wants to edit one unit, he locks it, edits the text and unlocks the object. The other client systems are notified of a newer version and update it locally.
- Awareness approach: The whole document is one object. Client information (like the cursor position in the document and each activity) is stored in user Things. If a user changes his position or performs any keystroke, the other users are notified (this creates a simple kind of awareness of the other users’ activities) and the locally stored document is updated accordingly. The disadvantage of this solution is that it does not scale well: work efficiency decreases as the team size increases (distraction through awareness information overflow and a less efficient control mechanism).
Workflow is becoming increasingly interesting, because organisations are attempting to increase quality and reduce cost by modelling and improving their internal processes. Workflow enables organisations to capture not only the information but also the process, including the rules that govern its execution. These rules include schedules, priorities, routing paths, authorisations, security, and the roles of each individual involved in the process ([Koulop]). That way the workflow management software helps to ensure integrity and reliability of the process.
Not all current workflow systems support all the mentioned services. Some workflow systems for example support only planning and investigation of business processes without automation.
- Support for coordination: Organisations are attempting to increase quality and reduce cost by modelling and improving their internal processes. Coordination (workflow) software can capture and coordinate these processes. So workflow systems consist of two main components:
- Capturing and planning of processes: That way already existing processes are inspected and (maybe) visualised to recognise problems of the existing structure. In many cases, just the act of analysing business in terms of processes leads to breakthroughs in efficiency without automation [Abbott94].
- Support for and automation of the process execution: On the one hand, this should ensure consistent quality of output and reliability of the process. On the other hand, coordination work can be passed from the user to the system. End-user productivity is improved by eliminating overhead time spent in "setting up" and "tearing down" (e.g. collecting and disseminating the information needed for performing tasks) [Abbott94].
- Support for transparency:
- Transparency of the process (definition): This makes it easier for people joining a group, because the workflow system provides a clear context for performing that work.
- Support for status monitoring and reporting: Group members get current information about the progress or history of events in a specific process; tasks, which are overdue, are notified to the administrator by the system, etc [Abbott94].
- Support for reorganisation: Because of the transparency of process definition and the support for online monitoring, problems in the organisation can be detected earlier and more easily. With the help of the workflow system the re-engineered business process could be simulated and tested.
- Creates a work environment: The workflow system controls the access rights of files and transfers necessary information to the user’s workspace to complete the task.
- Reduces unnecessary communication: Because administration, routing of information and status monitoring can be automated, the need for communication (organisation documents and information; verification of the status of the object) decreases. The working context is provided by the system (not through awareness of other people in the same virtual workplace like in chapter 5.2).
Although the features sound promising, workflow systems are still not very commonly used. One reason could be that people feel they are being controlled by the computer (e.g. project management systems). However, there is a stronger move towards these software tools because of the need for more cost efficiency and quality control.
There are still technical problems aside from this social one.
- Cost of process definition: Although the advantage of process definition is evident, considerable effort is needed to define the rules. Conflicts between the people who have an "idea" about the organisation and the (computer) experts implementing the rules could make it even more difficult to find a clear enough definition of the process. New workflow systems (especially ad-hoc workflow systems) with graphical user interfaces for easy process definition and visualisation address the need for adaptability if business processes change frequently.
- Inflexible definition of rules and processes: In many cases, these business rules cannot be simplified. A too clearly defined procedure would prevent the potential of "creative chaos" in the office. Stringent restrictions on the activity have traditionally been associated with workflow techniques ([Dourish96]). This could cause a considerable loss of innovation and flexibility.
- Exception handling: Various kinds of exceptions can happen in organisations (e.g. a document needs to be routed to a team member who is on holiday at that moment, but the deadline for finalising the document is soon). For these situations differing from standard (defined) situations, actions need to be defined.
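The holiday example can be turned into a small sketch of a routing path with one explicit exception rule. It is not modelled on any particular workflow product; the class `Workflow`, the idea of a deputy table and all field names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    route: list                                  # roles in routing order
    assignee: dict                               # role -> person
    deputy: dict = field(default_factory=dict)   # person -> stand-in
    absent: set = field(default_factory=set)     # people currently away

    def resolve(self, role):
        """Pick who handles a step, applying the exception rule for absence."""
        person = self.assignee[role]
        if person not in self.absent:
            return person
        stand_in = self.deputy.get(person)
        if stand_in is None or stand_in in self.absent:
            # No defined action is left; a human has to decide.
            raise LookupError(f"no one available for role {role!r}")
        return stand_in

    def plan(self):
        """Routing path with a concrete person for each step."""
        return [(role, self.resolve(role)) for role in self.route]
```

The standard case and the defined exception are handled by the system, while the undefined case is raised as an error so that it can be passed to an administrator instead of silently blocking the document.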
Workflow systems are still new technology. Computers were not as widely used 10 to 15 years ago. It is hard to apply a computer system to help manage work when very little of the work itself is done using a computer ([Abbott94]).
Nevertheless, different generations of workflow systems can be identified.
Workflow capabilities are used in particular applications, like document management or image management systems. Usually these systems are inflexible. The process definition (rules) is hard-coded and the systems are proprietary and closed.
The missing flexibility makes the system unsuitable for adjusting it to changing business rules.
In the next generation, the workflow system is separate from the work-tools like text editors. That way the user can decide which tools to use to perform the tasks. This also causes more acceptance on the user’s side. However, the integration is not as good as in the first generation any more.
The process rules can be adjusted by using script languages. Therefore, experts are necessary to adjust the system. Because the adjustment with script languages takes some time and effort these systems are not suitable for ad-hoc workflow.
In today’s systems, the workflow system is still separate from the work-tools. However, generic workflow services are accessible to other applications through APIs. The systems are more open and based on standards.
The adjustment to changing business rules is done through GUIs. Because expert knowledge is not needed so much anymore, these systems are more suitable for ad-hoc workflow.
The next generation of workflow systems should be fully integrated. The services should be integrated with other middleware services, like email, desktop management and directory. There should be standardised interfaces and interchange formats for information interchange between different workflow systems.
Applications should be workflow-enabled to be able to use the services. The workflow system will be ubiquitous in the user’s electronic work place but invisible.
Ad hoc workflow systems should be used if highly individualised processes are needed for each document. This could happen if the workgroups are dynamic or the business rules are changing or not clearly defined.
Ad hoc workflow requires the use of graphical development and definition tools that are easily and quickly modified by the end user who is not a workflow expert. A problem could be to prove the correctness of the defined process and correct exception handling.
Email-based workflow systems seem to be promising. They can use standard email applications (which makes automation more difficult) or proprietary email environments that just use the standard email exchange protocol. However, email systems do not seem to be suitable if status monitoring and logging are necessary.
Transaction-based workflow systems are used for production-based and high-volume environments. The process is highly structured and includes complex tasks. The rules can be precisely defined and executed like in factories.
These systems need a static environment. The tasks vary little among cases. That way the workflow system can be highly integrated. Efficient throughput is the primary concern.
The purpose of object-oriented technology is to enhance the developer’s ability to create complex applications, to increase the integrity of these applications, and to create an interface for both developer and user that is easy to navigate and use ([Koulop]).
As in object-oriented programming languages, the basic features are encapsulation and inheritance, here applied to object libraries and graphical environments. One instance of a workflow object consists of both the information and the process rules. Therefore, the rules may change, and objects that were instantiated before the time of change still use the old rules.
Without an object-oriented approach, it would be necessary to wait until all current work in progress is completed before changing the workflow rules. On the other hand, one instance of a workflow object can be modified while in progress without affecting the other objects. This makes object-oriented workflow systems suitable for ad-hoc workflow.
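A minimal sketch may illustrate why instances that carry their own rules are unaffected by later rule changes. The class names `WorkflowDefinition` and `WorkflowObject` and the representation of rules as a simple route of stations are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class WorkflowObject:
    document: str
    rules: Tuple[str, ...]      # route copied from the definition at creation time
    step: int = 0

    def next_station(self) -> str:
        return self.rules[self.step]


class WorkflowDefinition:
    def __init__(self, rules):
        self.rules = tuple(rules)

    def instantiate(self, document: str) -> WorkflowObject:
        # Each instance encapsulates the rules that are valid right now.
        return WorkflowObject(document, self.rules)

    def change_rules(self, rules) -> None:
        # Only objects instantiated after this call see the new rules.
        self.rules = tuple(rules)
```

Because every instance holds its own copy, a single object in progress can also be given an individual route without touching the others.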
Libraries can be used to reuse and recombine already existing business rules.
Knowledge-based workflow systems address the problem of the complexity of business processes. As mentioned in chapter 6.2 business processes are often too complex to describe them with some rules. The result could be inflexibility.
Knowledge-based workflow uses statistical methods, heuristic methods, and artificial intelligence to infer correct routing, scheduling, and exception routing. This reduces the problems of anticipating every rule and variable that may impact a business process.
Mail-centric systems use existing and widely accepted modes of electronic communication, especially email, as the delivery service. Many of these utilities are bundled with forms-based utilities like Microsoft Exchange Server.
The advantage of this system is the simplicity. Email protocols are simple and widely accepted. That way information interchange between different workflow systems is easier.
The disadvantage is that status monitoring is more difficult, because it is a distributed system.
Documents are associated with owners, applications, and rules that govern their routing and processing ([Koulop]). The focus is on the document. Many of these systems also provide facilities for document management, like check-in/check-out, locking, and versioning features.
These products often rely on underlying databases to store the necessary workflow information and for logging. Status and statistical information can be easily retrieved, because all the information is kept centrally. The whole work process is transparent.
Creation of Internet-Based Groupware
The rapid evolution of the data communications infrastructure is making distributed projects increasingly viable. Without a common infrastructure, computer-supported collaborative tools for distributed teams have been prohibitively expensive to build and maintain. However, the increasing availability of the Internet is enabling companies to develop cost-effective collaborative solutions [Ly97].
Increasing numbers of organisations are abandoning the earlier networking model of competing proprietary protocols. Instead, these enterprises are simply adopting Internet protocols as a common networking infrastructure that can be used for everything from serving web pages to retrieving email to running client-server applications. Applications based on open Internet application protocols enable organisations to simplify and enhance their communications with business partners, suppliers, and customers.
This broad use of Internet technology is now supported by the existence of open application standards that offer a range of features and functionality across all client and server platforms:
- HTML (and in future XML) and HTTP support platform-independent content creation and publishing and information sharing.
- Simple Mail Transfer Protocol (SMTP), Internet Message Access Protocol (IMAP), Multipurpose Internet Mail Extensions (MIME), Secure MIME (S/MIME), Network News Transfer Protocol (NNTP), and Real-time Transport Protocol (RTP) are just a few of the available standards that provide email, discussion, and conferencing capabilities, allowing for platform-independent messaging and collaboration.
- Lightweight Directory Access Protocol (LDAP) directories store and deliver contact information, registration data, certificates, configuration data, and server state information; these services provide support for single-user logon applications and strong authentication capabilities throughout the Internet.
- X.509 certificates provide a secure container of validated and digitally signed information; they offer strong authentication between parties, content, or devices on a network, including secure servers, firewalls, email, and payment systems; they are a foundation for the security in S/MIME, object signing, and Electronic Data Interchange (EDI) over the Internet.
- Simple Network Management Protocol (SNMP) offers network management capabilities.
Several specific strengths of the WWW make it very attractive as platform for developing collaborative applications (groupware), including:
- The wide availability of web browsers on a large number of platforms: This is an important point for organisations with a heterogeneous network of different computer platforms like PCs, Apple Macintoshes, Sun workstations, etc. Even if the Intranet is uniform, cross-platform availability becomes important if organisations want to deploy applications (up-to-the-minute product information, article-ordering applications) to their clients as well.
- The core technology being based on a set of widely accepted standards such as HTML, MIME types and Internet naming: This makes it easier to exchange files between different platforms and legacy programs. Even proprietary programs like Lotus Domino are starting to provide gateways for email, LDAP, etc.
- The coverage and extensibility of the resource naming conventions (URL): Files can be accessed from all over the world if the URL is known, because URLs are unique. The URL does not just point to the file, but also defines the protocol to use for the access.
- Relatively inexpensive: There is a huge amount of software available, and many programs, such as the Apache web server or the MS Internet Information Server, are even free. Java programs and ActiveX controls can be downloaded from freeware servers like Gamelan.
- Thin client: The client just needs a browser installed on its computer; the data and programs are kept centrally on the web server. Software should be loaded from the server on demand.
- Installation and maintenance: The applications either run on the server or are downloaded and installed automatically on demand. That way, manual software installations and updates should be kept to a minimum.
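The point made in the list above, that a URL names the access protocol as well as the location of the file, can be illustrated with a short sketch. It uses Python's standard `urllib.parse` module; the helper name `describe` and the example.org addresses are placeholders chosen for this illustration.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL into the protocol to use and the location of the resource."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # how to access the resource (http, ftp, ...)
        "host": parts.hostname,     # which server holds it
        "port": parts.port,         # None if the protocol's default port applies
        "path": parts.path,         # where the file lives on that server
    }


web = describe("http://www.example.org:8080/docs/index.html")
ftp = describe("ftp://ftp.example.org/pub/file.txt")
```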
- Programming Interfaces
- Location of Software Execution
- Server-Sided Software Execution
- Client Sided Software Execution
- Security Issues
- Server-side Security
- Client-side Security
There are several ways to run programs for the Internet. An important differentiation is the location of the program execution.
As mentioned before, people quickly realised that the WWW was good for simply providing information, but not as a platform for application development. HTML pages were static, which made it almost impossible to provide the user with current, dynamic information. Server-side programming interfaces were therefore introduced first to overcome this weakness.
The advantage of server-side development is that the hardware and software platform is known. Platform-neutral software development is therefore not necessary, and the software can be better integrated with, and make use of, the services of the local operating system. Because all the information and programs are stored centrally, updates are easy to manage as well.
A disadvantage of this approach is that the server needs to scale well. All the programs and every instance of a program need to be executed on this server. Reliability through redundancy and workload distribution is an issue to consider. Another disadvantage is increased network traffic. For every interaction with the user, the document needs to pass back the information to the server and a new document has to be generated.
Creating an application exclusively on the server side has one major drawback: it increases the number of document turnarounds. Each time the user fills out a form, for example, the information has to be sent back to the server to be checked for correctness, instead of using the computing power of the client to perform this easy task. This increases both the time needed to finish a task and the network traffic.
The workload of the server also increases drastically. So this solution would not scale very well. On the other hand the computation power of client computers increases very fast and is not used at all.
The problem is that the designer does not know the user’s hardware and software platform. There are two different approaches to this problem.
Microsoft’s ActiveX controls can be developed for a certain environment (usually Microsoft Windows, but not necessarily, because platform-neutral programming languages like Java are supported as well). If other environments need to be supported, different versions of that control have to be created.
Unfortunately, distribution over networks poses potential security problems, because the software must pass through many intermediate machines before it reaches the user's machine. Although millions of users download software every day without incident, the potential exists for either malicious or accidental damage to an individual user's system and data. The user often has no reliable way of confirming where a piece of software is really coming from, whether it has been changed in transit over the network, and what kind of damage it might be capable of causing.
Security features vary in different applications. In general, there are two different security concerns.
On the one hand, the server has to restrict access to certain information to privileged users and hide it from others. This issue was not addressed by 1st generation Web tools, because they only allowed downloading of multimedia documents and no authoring on the Web. So there was just the policy of "read-only" for all documents.
Modern 1st generation tools like Netscape’s Enterprise Server 3.0 and 2nd generation Web tools like Hyperwave address this issue and integrate security checks into the server.
Another problem is that users are allowed to execute programs on the server. This could be exploited to breach security. Some programming languages run in a controlled sandbox environment to restrict possible damage. CGI usually restricts executable programs to a certain subdirectory.
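The idea of confining executable programs to one subdirectory can be sketched as a simple path check. This is an illustrative sketch only: the function name `is_allowed_script` and the directory `/srv/www/cgi-bin` are assumptions, the check is purely lexical (it does not resolve symbolic links), and real web servers implement this restriction in their own configuration.

```python
import posixpath


def is_allowed_script(cgi_root: str, requested: str) -> bool:
    """Accept a request only if it stays inside the CGI directory."""
    root = posixpath.normpath(cgi_root)           # assumed to be an absolute path
    # Resolve "." and ".." so that "../../etc/passwd" cannot escape the root.
    target = posixpath.normpath(posixpath.join(root, requested))
    inside = posixpath.commonpath([root, target]) == root
    return inside and target != root
```

A request such as `search.cgi` passes the check, whereas `../../etc/passwd` or an absolute path like `/bin/sh` is rejected.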
Same Origin Policy
When a document is loaded from one source, a script loaded from a different source cannot get or set certain predefined properties of certain browser and HTML objects in a window or frame (such as the window and document objects).
Navigator defines the source as the sub-string of a URL that includes "protocol://host" where host includes the optional :port part.
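This definition of a source can be sketched in a few lines. The functions `origin` and `same_origin` are hypothetical helpers written for this illustration, not part of any browser API, and the example.org addresses are placeholders. Following the definition literally, no default port is filled in, so a URL with an explicit port and one without are treated as different sources.

```python
from urllib.parse import urlsplit


def origin(url: str) -> str:
    """Return the "protocol://host" part of a URL, including an optional :port."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def same_origin(document_url: str, script_url: str) -> bool:
    # A script may access the restricted properties only if both sources match.
    return origin(document_url) == origin(script_url)
```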
A signed script requests expanded privileges in order to gain access to restricted information. It requests these privileges by using LiveConnect and the new Java classes referred to as the Java Capabilities API. These classes add facilities to and refine the control provided by the standard Java SecurityManager class. They can be used to exercise fine-grained control over activities beyond the "sandbox", the Java term for the defined limits within which Java code must otherwise operate.
Virtually any programming language can be used nowadays to create applications for the Web. ActiveX for example is defined language-neutral. CGI is also a language-neutral interface definition. However, it gets a little bit more difficult, if platform independence is necessary (this is often needed, if applications are executed on the client side without knowing the client’s hardware and software environment). Binary ActiveX controls can be an efficient way if the environment is known or defined like in the Intranet of an organisation.
The second language is Java, introduced by Sun Microsystems. Java runs in a platform-independent environment called the Java Virtual Machine (JVM).
The core JavaScript language corresponds to ECMA-262, the scripting language standardised by the European standards body ECMA, with some additions [ECMA97].
Java, introduced by Sun [Sun97], is a simple object-oriented programming language as close to C++ as possible. It has automatic garbage collection, thereby simplifying the task of Java programming. Java has an extensive library of routines for coping easily with TCP/IP protocols like HTTP and FTP, which is important for a language developed for the Internet. This makes creating network connections much easier than in C or C++. Like C++, Java is a strongly typed language. One of the advantages of this approach is that it allows extensive compile-time checking so bugs can be found early. Java has a pointer model that eliminates the possibility of overwriting memory and corrupting data. Java is multi-threaded and has a sophisticated set of synchronisation primitives that are based on the widely-used monitor and condition variable paradigm.
Java was designed to support applications on networks. In general, networks are composed of a variety of systems with a variety of hardware and operating system architectures. To enable a Java application to be executed anywhere on the network, the compiler generates an architecture-neutral object file format: the compiled code is executable on many processors, given the presence of the Java runtime system. Java bytecodes are translated on the fly to native machine instructions (interpreted).
Java (as an applet) cannot interact with the document; it cannot even resize itself in the document. Java is used for more complicated tasks like establishing an additional TCP/IP connection to another server for retrieving information from a database or a directory server.
Started as an applet, Java is very restricted. Netscape implemented the so-called Java Sandbox security feature. This means that an applet cannot read from or write to the local hard disk and can read only certain user environment information. An applet may even open TCP/IP connections only to the host from which it was loaded. However, since version 1.1.x, Java supports a "signed applets" feature that allows an applet to "leave" its sandbox and to perform specific tasks outside the sandbox if permitted.
The Web is also changing the way applications are created. Traditional hand coding of applications simply takes too long and costs too much for application developers to stay competitive. Modularity and reusability of software components are becoming more important.
A software component is not the same as an object, although components use the object programming encapsulation capability to create self-contained units of reusable code. The most important thing about components is that they provide the ability to reuse hunks of code. A component architecture allows breaking applications into smaller pieces. These smaller pieces can be recombined to build other applications again and again. Therefore, a software component model is a specification for how to develop reusable software components and how these component objects can communicate with each other when they are combined into a bigger application.
With the introduction of Java and especially JavaBeans, a strong move towards Rapid Application Development (RAD) can be observed. Users of software components should be able to simply hook together existing components in a visual development environment ([Hughes97]). To be suitable for these new graphical application development tools the component models need to support other capabilities as well.
- The component must be able to describe itself: The programmer needs to know which properties there are to customise this component, which events can happen. The programmer (or the RAD tool) needs to know the interface of this component.
- The component must allow graphical editing of its properties: A RAD environment is intensely graphical. The configuration of the software component is done almost exclusively through control panels that expose accessible properties.
- The component must be directly customisable from a programming language: This feature allows software components to be manipulated in non-visual development projects, from a code level.
- The component must provide some mechanism that lets programmers semantically link components: Some components create events and others consume them. That way, components can be linked together without programming (which is essential in graphical development environments). A button component for example has an event source for button clicks and a customisation interface for de- and reactivation. The button component can be connected to a "file save" component in such a way that the button click event source is (visually) connected to the save trigger of the "file save" component.
JavaBeans is the component framework for Java programs [Hamil96]. The beans specification, however, is more a set of programming rules and practices and contains very little in terms of class definitions [Renshaw]. Creating a bean does not involve many additional steps; it is rather a matter of using the available classes and interfaces in a particular way, combined with some design and naming rules.
JavaBean components support auto-description of their interface through an introspection mechanism. A JavaBean can be studied to determine its properties and events in one of two ways: through the Reflection API or through the BeanInfo class [Hughes97].
The introspection mechanism uses the BeanInfo class if it is supplied. The various descriptor classes that are returned by this interface are then used to determine the Bean's features.
If this BeanInfo class is not available, the features of the Bean can be determined automatically from its method names (such as getXxx and setXxx for a property Xxx) if the Bean author follows the JavaBean design pattern specification.
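The principle of deriving properties from method names can be sketched as follows. Note that this is a Python analogue written for illustration, not the JavaBeans Introspector: the function `discover_properties` and the sample `Button` class are assumptions, and only the get/set naming convention is modelled.

```python
import inspect


class Button:
    """Sample component whose author follows the get/set naming convention."""

    def __init__(self):
        self._label = "OK"

    def getLabel(self):
        return self._label

    def setLabel(self, value):
        self._label = value

    def getClickCount(self):
        return 0


def discover_properties(component) -> dict:
    """Infer property names and access modes purely from method names."""
    access = {}
    for name, _ in inspect.getmembers(component, callable):
        prefix, rest = name[:3], name[3:]
        if prefix not in ("get", "set") or not rest[:1].isupper():
            continue
        prop = rest[0].lower() + rest[1:]          # "Label" becomes "label"
        access.setdefault(prop, set()).add(prefix)
    labels = {
        frozenset({"get"}): "read-only",
        frozenset({"set"}): "write-only",
        frozenset({"get", "set"}): "read-write",
    }
    return {prop: labels[frozenset(kinds)] for prop, kinds in access.items()}
```

A development tool could use such a mapping to build a property sheet without any explicit description supplied by the component author.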
The Rapid Application Development environment inspects the interface of the component and provides a property sheet for the user to configure the Bean. Alternatively, Bean designers can include their own customisation class to provide a configuration wizard for the user.
Different Beans are linked together through events. Every Bean declares the events that it can fire. When an application is assembled from Beans, the programmer registers listeners for these events. When an event occurs, all of the registered listeners are notified [Hughes97].
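The listener mechanism can be sketched with two small components, following the button and "file save" example used earlier in this chapter. This is again a Python analogue under stated assumptions: the classes `Button` and `FileSaver` and the method `add_click_listener` are hypothetical names and do not reproduce the actual JavaBeans event classes.

```python
class Button:
    """Event source: fires a click event to every registered listener."""

    def __init__(self):
        self._listeners = []

    def add_click_listener(self, listener):
        self._listeners.append(listener)

    def click(self):
        # Notify all registered listeners; the button knows nothing about them.
        for listener in list(self._listeners):
            listener("click")


class FileSaver:
    """Event consumer: its save trigger can be wired to any event source."""

    def __init__(self):
        self.saved = 0

    def on_save(self, event):
        self.saved += 1


button = Button()
saver = FileSaver()
button.add_click_listener(saver.on_save)   # the "visual" connection, done in code
button.click()
```

The two components are written independently of each other; only the registration step connects them, which is what a graphical development tool generates when the user links them visually.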
JavaBeans provide programmer security (protects the programmer from incorrect type casting or illegal memory accesses) and resource-level security for the client (sandbox, digitally signed applets from trusted sources can request extended resource access).
- Links to Remote States
The three primary network access mechanisms that are available to Java Beans developers on all Java platforms are [Hamil96]:
- Java RMI: The Java Remote Method Invocation facility makes it very easy to develop distributed Java applications. The distributed system interfaces can be designed in Java and clients and servers can be implemented against those interfaces. Java RMI calls will be automatically and (almost) transparently delivered from the client to the server.
- Java IDL: The Java IDL system implements the industry standard OMG CORBA distributed object model. All the system interfaces are defined in the CORBA IDL interface definition language. Java stubs can be generated from these IDL interfaces, allowing Java Bean clients to call into IDL servers, and vice versa. CORBA IDL provides a multi-language, multi-vendor distributed computing environment, and the use of Java IDL allows Java Bean clients to talk to both Java IDL servers and other non-Java IDL servers.
- JDBC: The Java database API, JDBC, allows Java Bean components access to SQL databases. This database can be either on the same machine as the client, or on a remote database server. Individual Java Beans can be written to provide tailored access to particular database tables.
ActiveX is Microsoft's component object model for the Web environment and uses DCOM (Distributed Component Object Model) for transparent communication and message passing between objects (also called controls). It is an OLE-based component system that allows a piece of code, an ActiveX control, to be downloaded to a Web page and run locally. ActiveX doesn’t define a specific programming language, so Java can actually be used to program an ActiveX control. However, one of the main purposes of ActiveX is to allow developers to reuse OLE-compatible code.
With ActiveX, controls can communicate with each other, regardless of their implementation language and platform. This is not to say that a given ActiveX control binary is portable; rather it means that the specification allows for communication between controls on different platforms that support ActiveX [Hughes97].
ActiveX controls describe themselves through a binary description file that lists the properties, methods, and events that should be exposed. Usually this interface is automatically created through wizards that guide the user through the process step by step. So the creation of the interface description is more complicated than with JavaBeans but it is highly automated by Microsoft's development tools.
Like JavaBeans, ActiveX controls use events to link controls together.
ActiveX relies solely on code signing to ensure security. The idea is that unsigned code will never be run, and that signed code will run only when the user agrees to run it, given the cryptographically safe signature indicating the code’s origins. However, once the user accepts an ActiveX signature, the ActiveX control runs completely unchecked, with no additional security features [Adida97]. That way ActiveX controls can take advantage of native APIs and programs. This makes ActiveX an extremely powerful technology, which has to be used carefully.
The Web has placed in our hands the potential to communicate with anyone, anywhere. Fully realising this potential depends on widespread use of standards. These standards allow a page to be created once, yet displayed at different times by many receivers.
Although visual and user interface standards are necessary layers, they are insufficient for representing and managing data. The Internet must go beyond setting an information access and display standard. It must set an information understanding standard [Micro97].
Most documents on the Web are stored and transmitted in HTML, which is a simple language well suited for hypertext, multimedia, and the display of small and reasonably simple documents. HTML is based on SGML (Standard Generalised Markup Language), but has hardwired just a small set of tags, leaving the language specification out of the document for the ease of usage.
But this ease comes at the cost of severely limiting HTML in several important respects [Bosak97]:
- Extensibility: HTML does not allow users to specify their own tags or attributes in order to parameterise or otherwise semantically qualify their data. The addition of new tags causes long standardising procedures.
- Structure: HTML does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies for efficient data exchange.
- Validation: HTML does not support the kind of language specification that allows consuming applications to check data for structural validity of the document.
- Static views: HTML is a static language. The change of views of data causes additional document turnarounds instead of reorganising the information on the client side. This is an additional burden for the Web server and the network.
XML can encode the representation for [Micro97]:
- An ordinary document
- A structured (database) record, such as an appointment record
- An object, with data and methods (e.g. a Java object or an ActiveX control)
- Meta-content about a Web site
- Graphical presentation (such as an application’s user interface) ...
XML could be used for applications that require the Web client to present different views of the same information. The Web server delivers just the information. When the client wants to re-sort (telephone list of a company) or hide part of the information (switching between a view of a document with and without annotations) the client computer does the work. Documents can carry messages in multiple languages and the user decides which one to display. The server and the network are not burdened anymore.
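The telephone list example can be sketched as follows. The element names `phonelist`, `entry`, `name` and `ext` and the sample data are invented for this illustration; the point is only that, once the structured data has been delivered, the client can produce different views without contacting the server again.

```python
import xml.etree.ElementTree as ET

# Structured data as it might be delivered once by the Web server (invented sample).
PHONE_LIST = """
<phonelist>
  <entry><name>Maier</name><ext>214</ext></entry>
  <entry><name>Bauer</name><ext>305</ext></entry>
  <entry><name>Huber</name><ext>101</ext></entry>
</phonelist>
"""


def sorted_view(xml_text: str, key: str):
    """Build a view of the list on the client, sorted by 'name' or 'ext'."""
    root = ET.fromstring(xml_text.strip())
    rows = [(e.findtext("name"), e.findtext("ext")) for e in root.iter("entry")]
    index = {"name": 0, "ext": 1}[key]
    return sorted(rows, key=lambda row: row[index])


by_name = sorted_view(PHONE_LIST, "name")
by_ext = sorted_view(PHONE_LIST, "ext")
```

Both views are computed from the same downloaded document, so switching between them causes no additional document turnaround.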
Because the client browser understands the structure of the information, the server does not need to be burdened with queries for relevant information. The information can be downloaded to the client and processed locally. Software agents and Web crawlers can perform precise queries for information, because of additional meta-information.
These are just some examples of possible applications. XML should make structure and properties of information visible and automatic analysis possible. The workload should shift from the server to the client side, which increases the scalability of the server and decreases the amount of data to send across the network.
Functionality of an Asynchronous Conference System
In this chapter we discuss one particularly important aspect of CSCW, asynchronous conferencing, in more detail.
- Native news servers in combination with native news clients are faster, but they usually store some information locally (already read news, downloaded news headers...). This information must be accessible if the user moves to another computer. One solution is to store the information on a network drive.
- The functionality of the server and client needs to be extended by programs. This is an overhead when creating a discussion forum.
- One uniform Web interface for browsing the Web and the discussion server.
- Thin client: No additional software needs to be installed on the client side. If applications are used, they can be downloaded from the server on demand. However, common software packages like Netscape’s Communicator or Microsoft’s Internet Explorer already come with a news client.
- Flexibility: The functionality of a commercial news client cannot be changed.
- HTML text format: Web interfaces use HTML by default. This HTML text format can be used to include pictures, forms, links and even programs. That way e.g. application forms can be sent electronically using the rich HTML text format. Some news servers and clients already use HTML (like Netscape Collaboration server and Communicator), but there is still no standardisation. Therefore, this rich format needs to be converted to plain text if being sent to an unknown client.
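The conversion from the rich HTML format to plain text mentioned in the last point can be sketched with Python's standard `html.parser` module. The function `html_to_text` is a hypothetical helper for this illustration; it simply drops all tags and starts a new line at paragraphs and line breaks, so links, pictures and forms are lost in the plain-text version.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text content and mark paragraph and line breaks."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("p", "br"):
            self.chunks.append("\n")

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.chunks).splitlines()
    # Collapse runs of whitespace and drop empty lines.
    cleaned = [" ".join(line.split()) for line in lines]
    return "\n".join(line for line in cleaned if line)
```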
- Functionality of a Full-Featured WWW-Based Conference System
- Full support of a rich text format (HTML)
- Searchable Meta Information
- General Meta Information
- Document Type
There are already powerful tools for creating Web pages, e.g. Netscape Composer (which is part of Netscape Communicator) or Microsoft WinWord, which provides converters to create HTML pages. It is easier for clients if no other formats are introduced and email programs simply use this already widely accepted standard.
Header information like author, title and date of creation is stored as searchable meta information.
The document type like question, answer, and comment is stored as meta information as well. This information can be used to display them differently when showing the hierarchy of documents.
- Freewords: important keywords which are not in the list of keywords (see chapter 8.6).
- Listwords: important keywords which are in the list of keywords (see chapter 8.6).
- Hierarchy of Documents
- List of Keywords (LoK)
The hierarchy can be compared with a UNIX file system, so an easy and intuitive way to display it on the client side should be straightforward to find.
The hierarchy of LoK (List of Keywords, see chapter 8.6) represents the folder hierarchy. Every document is an object of its own with a unique key. Every document is linked to at least one folder: either to one or more folders of the Listwords specified in the document (see 22.214.171.124) or to a general folder if no Listword is specified. If a new document is created while browsing the hierarchy, the current folder (keyword) could be the default value.
A document is only visible in a folder if it has the folder keyword as Listword.
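The linking rule can be sketched as a small data structure. The `Document` record, the function names and the sample documents are assumptions made for this illustration; only the rule itself (a document appears in the folders of its Listwords, or in the general folder if it has none) is taken from the text.

```python
from dataclasses import dataclass, field

GENERAL = "General"   # assumed name of the general folder


@dataclass
class Document:
    key: str                                      # unique key of the document
    listwords: set = field(default_factory=set)   # keywords contained in the LoK
    freewords: set = field(default_factory=set)   # keywords not in the LoK


def folders_of(doc: Document) -> set:
    """A document is linked to its Listword folders, else to the general folder."""
    return set(doc.listwords) or {GENERAL}


def visible_in(folder: str, documents) -> list:
    """Keys of all documents that are displayed in the given folder."""
    return sorted(d.key for d in documents if folder in folders_of(d))


docs = [
    Document("1", listwords={"Mouth", "Little toe"}),
    Document("2", listwords={"Mouth"}),
    Document("3", freewords={"Spare Time"}),
]
```

Freewords do not influence the placement: document 3 appears only in the general folder although it carries a keyword.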
This could be an example hierarchy for a medical news group. Document 1 could be an article about the influence of mouth diseases on the growth of toe-nails. Document 2 could be a question on a mentioned tongue disease.
Although Document 1a is referencing Document 1, it is not displayed in the hierarchy of the "little toe" folder. But it would make sense to make a link from Document 1 to Document 1a.
It is possible that a document references another one without being in the same hierarchy if these documents don’t share a common Listword. This should not happen too often, because usually the referencing document is of the same or similar topic.
This makes it necessary to provide the client not only with an Explorer-style browser of the hierarchy but also with a list of the articles that reference the current document.
The LoK is itself a hierarchy. Words can be added and removed on demand, which also makes it a powerful tool for maintaining the forum: it can be used to adjust the depth and width of the hierarchy.
This is an example hierarchy that is used in the following chapters to show how the hierarchy is changing when different actions are applied to it.
Three types of action are possible.
There are two reasons to add a new Listword:
- Either there is a new area of topics, e.g.:
- Not only physical but also mental diseases should be discussed.
- There are a lot of documents in the general folder about accidents in one’s spare time.
- Or an existing topic needs to be refined, because too many documents are posted to that folder, e.g.:
- The keywords "Tongue" and "Tooth" are added to the hierarchy of "Mouth"
A new Listword is added at the first level of the hierarchy if it covers a totally new topic. If, for example, more and more articles about accidents in one’s spare time appear in the "General" folder, a new folder "Spare Time" can be created and the articles moved into it. That way the number of documents in the "General" folder can be kept small.
Document 4 is moved from folder "General" to folder "Spare Time" because it deals with accidents in one’s spare time. To do this, the document’s keyword "Spare Time" is converted from Freeword to Listword.
If there are already too many documents in one folder (e.g. "Mouth"), because the topic is too general, the hierarchy can be extended by adding additional folders to the next level of hierarchy.
Because too many documents are in folder "Mouth", the two new Listwords "Tongue" and "Tooth" are added and documents dealing with these topics are moved to the new folders. To remove these documents from the main folder "Mouth", the Listword "Mouth" is removed from them.
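Both cases of adding a Listword can be sketched in a few lines of Python. The data layout (the LoK as a mapping from keyword to parent keyword, documents as dictionaries with "listwords" and "freewords" sets) and the function name are assumptions for illustration. The sketch also assumes that documents dealing with the new topic already carry the word as a Freeword.

```python
from typing import Optional


def add_listword(lok: dict, docs: list, word: str, parent: Optional[str] = None) -> list:
    """Add `word` to the LoK and promote it from Freeword to Listword.

    If `parent` is given, the new folder refines an existing one and the
    parent Listword is removed from the promoted documents, so that they
    leave the overfull parent folder. Returns the keys of the moved documents.
    """
    lok[word] = parent
    moved = []
    for doc in docs:
        if word in doc["freewords"]:
            doc["freewords"].remove(word)
            doc["listwords"].add(word)
            if parent is not None:
                doc["listwords"].discard(parent)
            moved.append(doc["key"])
    return moved


# Hypothetical LoK and documents following the examples in the text.
lok = {"Mouth": None}
docs = [
    {"key": "2", "listwords": {"Mouth"}, "freewords": {"Tongue"}},
    {"key": "4", "listwords": set(), "freewords": {"Spare Time"}},
]

print(add_listword(lok, docs, "Spare Time"))              # new area of topics
print(add_listword(lok, docs, "Tongue", parent="Mouth"))  # refinement of "Mouth"
```

After both calls, Document 4 is found in "Spare Time" instead of the general folder, and Document 2 is found in "Tongue" instead of "Mouth".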
Folders need to be moved when the hierarchy has to be adjusted. If, for example, the number of sub-folders in one folder increases, an additional level can be added to improve usability.
Conversely, if the number of sub-folders is small but the user has to pass through too many levels to reach the desired article, browsing becomes too time-consuming and the risk of taking a wrong turn increases.
Basically, this adaptation of the hierarchy can be achieved by combining the operations of adding (chapter 8.6.1) and removing (chapter 8.6.3) Listwords.
To keep the number of Listwords small, a Listword should be removed if there is little interest in its topic. The documents can be moved into a more general folder (or even into the "General" folder).
This is a modification of the refined hierarchy shown earlier, in which "Tongue" and "Tooth" were added below "Mouth". The Listword "Tooth" was removed because there was just one document in this folder. The document was moved back to the parent folder "Mouth" and "Tooth" was changed from Listword to Freeword.
If the folder that needs to be removed is already at the first level of the hierarchy, the document is moved to the general folder.
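Removing a Listword is the reverse operation and can be sketched in the same way. As before, the data layout and the function name are assumptions for illustration, and the sketch assumes that the removed folder has no sub-folders of its own.

```python
GENERAL = "General"  # assumed name of the catch-all folder


def remove_listword(lok: dict, docs: list, word: str) -> str:
    """Remove `word` from the LoK and demote it to a Freeword in all documents.

    Documents move to the parent folder. If the removed folder was at the
    first level, no Listword is added; a document left without any Listword
    then belongs to the general folder. Returns the name of the target folder.
    """
    parent = lok.pop(word)
    for doc in docs:
        if word in doc["listwords"]:
            doc["listwords"].remove(word)
            doc["freewords"].add(word)
            if parent is not None:
                doc["listwords"].add(parent)
    return parent if parent is not None else GENERAL


# Hypothetical state after "Tongue" and "Tooth" were added below "Mouth".
lok = {"Mouth": None, "Tongue": "Mouth", "Tooth": "Mouth", "Spare Time": None}
docs = [
    {"key": "3", "listwords": {"Tooth"}, "freewords": set()},
    {"key": "4", "listwords": {"Spare Time"}, "freewords": set()},
]

print(remove_listword(lok, docs, "Tooth"))       # document 3 goes back to "Mouth"
print(remove_listword(lok, docs, "Spare Time"))  # document 4 goes back to the general folder
```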
The Listwords can be used to build up a folder hierarchy that can be browsed like file folders. The client can browse through the hierarchy to search for interesting topics. This is straightforward, because most clients are used to this functionality. Most GUI-based file managers use some kind of Explorer (e.g. MS Explorer) to browse through the folder hierarchy of a file system.
Additional information can be used to indicate "hot" areas, e.g.:
- Number of documents in a folder.
- Number of new documents in a folder compared to the number of new documents in the whole hierarchy (see chapter 8.7.4)...
In most discussion forums the output of a query is a list of documents. This is fine in a discussion forum with few documents, but in large forums it can be confusing. The Listword hierarchy, on the other hand, is an easy way to help the user understand how the information is structured.
Therefore, two display modes should be implemented:
- Simple list: This mode can be used by default, if only few documents were found as query result.
- Hierarchical view: If the query was not precise enough and many documents were found, the documents can be structured in a hierarchy (again using the Listwords and document dependencies).
It should be possible for the user to switch between these two display modes. The system could also choose the initial display mode automatically, depending on the number of documents found.
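A minimal sketch of the two display modes might look as follows. The threshold of 20 documents, the function names and the representation of a query result as (key, Listwords) pairs are assumptions chosen for illustration; a real system would tune the threshold and would also take document dependencies into account.

```python
from collections import defaultdict

LIST_THRESHOLD = 20  # assumed cut-off between "few" and "many" results


def initial_display_mode(result_count: int, threshold: int = LIST_THRESHOLD) -> str:
    """Choose the initial display mode from the size of the query result."""
    return "list" if result_count <= threshold else "hierarchy"


def group_by_listword(results: list) -> dict:
    """Structure a query result by Listword for the hierarchical view.

    `results` is a list of (document key, set of Listwords) pairs.
    Documents without a Listword are shown under "General".
    """
    folders = defaultdict(list)
    for key, listwords in results:
        for word in sorted(listwords) or ["General"]:
            folders[word].append(key)
    return dict(folders)


results = [("1", {"Mouth", "Little Toe"}), ("2", {"Mouth"}), ("4", set())]
print(initial_display_mode(len(results)))
print(group_by_listword(results))
```

The user can still switch modes manually; the function only proposes the initial one.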
Another feature is to select a document from the search result and to change back to standard browsing mode, so that all the other documents in the folder of the selected document are shown as well. That way, the user can look for similar documents within the same area.
Searchable meta information decreases the search time compared to full-text searches and the query result is more precise.
The document’s date and time of creation are especially useful. They can be used to search for new documents, e.g. documents created last week or since the user’s last visit to the discussion forum. It would be user-friendly to store the date of the last visit so that the user does not have to key it in. This could be done with cookies, which has the disadvantage that the information is stored on the client’s computer: if the user switches to another computer, the information is no longer available. Alternatively, the date could be stored on the server side, provided that the user’s account information (e.g. username and password or a digital signature) is available.
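The server-side variant can be sketched as follows. The in-memory dictionary stands in for whatever user database the server provides, and the function name and the representation of documents as (key, creation date) pairs are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical server-side store: username -> date of the last visit.
last_visits = {}


def new_documents(username: str, docs: list, now: datetime) -> list:
    """Return the keys of documents created since the user's previous visit.

    `docs` is a list of (document key, creation date) pairs. The current
    visit is recorded afterwards. A first-time visitor sees all documents.
    """
    since = last_visits.get(username, datetime.min)
    last_visits[username] = now
    return [key for key, created in docs if created > since]


docs = [("1", datetime(1998, 1, 5)), ("2", datetime(1998, 2, 1))]
print(new_documents("anna", docs, now=datetime(1998, 1, 20)))  # first visit
print(new_documents("anna", docs, now=datetime(1998, 2, 10)))  # second visit
```

Because the date is kept on the server, the result is the same regardless of which computer the user logs in from.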
Full-text search is a feature supported by many new Web servers. However, most servers only allow you to search the whole server or predefined areas.
To improve the accuracy it would be helpful to be able to dynamically restrict the query to a certain area of the hierarchy. That way the browsing functionality and the full-text query can be combined. The user browses through the hierarchy to find an interesting branch and performs a full-text search for relevant documents in that hierarchy.
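Combining browsing and full-text search could be sketched like this. The LoK is again assumed to be a mapping from keyword to parent keyword, and a naive case-insensitive substring test stands in for a real full-text index; both are simplifications for illustration.

```python
def subtree(lok: dict, root: str) -> set:
    """All folders in the branch rooted at `root`, including `root` itself."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for word, parent in lok.items():
            if parent in found and word not in found:
                found.add(word)
                changed = True
    return found


def search_branch(lok: dict, docs: list, root: str, term: str) -> list:
    """Search only documents that are visible somewhere below `root`."""
    folders = subtree(lok, root)
    term = term.lower()
    return [
        doc["key"]
        for doc in docs
        if doc["listwords"] & folders and term in doc["text"].lower()
    ]


# Hypothetical LoK and documents.
lok = {"Mouth": None, "Tongue": "Mouth", "Tooth": "Mouth", "Little Toe": None}
docs = [
    {"key": "1", "listwords": {"Tongue"}, "text": "A swollen tongue can indicate an infection."},
    {"key": "2", "listwords": {"Little Toe"}, "text": "Swollen toes after an accident."},
    {"key": "3", "listwords": {"Tooth"}, "text": "Tooth decay in children."},
]

print(search_branch(lok, docs, "Mouth", "swollen"))
```

The same query without the restriction would also return Document 2, which lies in a different branch of the hierarchy.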
One way to search for relevant information is to find documents with the same or similar Listwords and Freewords. That way, documents covering the same topics can be found.
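How "similar" two keyword sets are is left open here. One simple possibility, used in the following sketch purely as an assumption, is the share of common keywords among all keywords of both documents (the Jaccard coefficient).

```python
def keyword_similarity(a: set, b: set) -> float:
    """Share of common keywords among all keywords of both documents (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Listwords and Freewords of two hypothetical documents, taken together.
doc_a = {"Mouth", "Tongue", "Infection"}
doc_b = {"Mouth", "Tooth", "Infection"}

print(keyword_similarity(doc_a, doc_b))
```

Here two of four distinct keywords are shared, so the documents are rated as half similar.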
Another way of calculating the similarity of documents is to build a matrix of all documents and all words occurring in them (as is usually done for full-text indices). Each document is then represented by an n-dimensional vector, and the similarity of two documents can be derived from the distance or angle between their vectors.
This similarity can be used in two ways:
- Searching for documents which are similar to the current document, if the current document is interesting.
- Searching for similar documents when inserting a new document. That way, documents which could be referenced can be found. This could help people to integrate the document into an already existing hierarchy. The server could, for example, start searching for similar documents while the user keys in the document. If an interesting document is found, the user could browse it and reference it if relevant.
If the server searches for similar documents across the boundaries of the hierarchy the discussion would become more inter-disciplinary.
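The vector-based similarity described above can be sketched with word-count vectors and the cosine of the angle between them. Splitting on whitespace and using raw counts are simplifying assumptions; a real full-text index would normally also weight and normalise the words. All names are chosen for illustration.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Word-count vector of a document (lower-cased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(text: str, corpus: dict, top: int = 3) -> list:
    """Keys of the documents most similar to `text`, best match first."""
    query = term_vector(text)
    scored = [(cosine_similarity(query, term_vector(body)), key) for key, body in corpus.items()]
    return [key for score, key in sorted(scored, reverse=True)[:top] if score > 0]


# Hypothetical forum content.
corpus = {
    "1": "tongue disease treatment",
    "2": "toe nail growth",
    "3": "tongue pain",
}

print(most_similar("question about a tongue disease", corpus))
```

The same function could be called while a user is typing a new document, in order to suggest existing documents to reference.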
Statistical information about the discussion forum could be:
- The number of new documents in that part of the hierarchy compared to the total number of new documents. This feature is aimed at people who are not searching for something specific but are interested in "hot" topics in general.
- The number of existing documents in that part of the hierarchy compared to the total number of existing documents. This information could also be used for maintenance reasons. If the number of documents in a folder rises above or stays below a certain level the hierarchy needs to be adjusted by splitting the hierarchy or by removing some folders, because too many or too few people are interested in these topics.
- Further statistical information could be the number of accesses to a document in a certain period of time. Again, this information could be used to search for "hot" topics or for maintenance purposes. If nobody is interested in a document, it could be removed or moved to a backup folder after some time. That way it can be ensured quite easily that only interesting topics are kept in the forum. This keeps the number of documents low and eases both maintenance and the search for relevant information.
As discussed in chapter 8.7.4, statistical information helps to maintain the discussion forum. The hierarchy needs to be adjusted to the number of documents in the folders (see chapter 8.6). Old documents which nobody is interested in need to be removed or moved to a "backup area".
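The folder statistics above can be computed directly from per-folder document counts. In the following sketch the function names and the two thresholds are assumptions for illustration; suitable limits depend on the forum.

```python
def folder_shares(counts: dict) -> dict:
    """Share of each folder in the total number of (new or existing) documents."""
    total = sum(counts.values())
    return {folder: (n / total if total else 0.0) for folder, n in counts.items()}


def maintenance_hints(doc_counts: dict, split_above: int = 50, remove_below: int = 2) -> dict:
    """Suggest folders to split (too many documents) or to remove (too few)."""
    hints = {}
    for folder, n in doc_counts.items():
        if n > split_above:
            hints[folder] = "split"
        elif n < remove_below:
            hints[folder] = "remove"
    return hints


# Hypothetical counts per folder.
new_docs = {"Mouth": 3, "Little Toe": 1}
all_docs = {"Mouth": 80, "Tongue": 10, "Tooth": 1}

print(folder_shares(new_docs))      # indicator for "hot" areas
print(maintenance_hints(all_docs))  # candidates for adjusting the hierarchy
```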
If desired, new articles can be read and verified before making them visible to the users of the forum. The date of insertion can be used to find newly inserted articles.
It has been shown in sections 2.2 and 7.1 that the Internet is an interesting platform for groupware. The Internet is a marketplace with millions of customers and an information store holding thousands of millions of documents. Millions of people are permanently connected to the Internet, and there are widely accepted protocols for asynchronous communication (used by email, discussion groups, etc.) and synchronous communication (used by video conferencing tools, whiteboards, etc.).
Interest in the new field of groupware is growing fast as well, and groupware products are already on the market. Open problems remain: efficient coordination of tasks is still poorly supported, awareness features are difficult to evaluate, the social aspects of groupware pose a challenge for application design, and the introduction of new CSCW applications needs to be planned carefully.
Commonly used protocols have not yet been established for the automatic interchange of groupware information (document routing information, awareness activities, etc.) or for connecting different groupware systems (especially workflow systems).
Another problem is the need for flexibility in groupware. Rapid Application Development and component models could help people to create their own groupware by reusing software from repositories and standardised libraries. That way, software could be adapted to changing needs.
[Abbott94] Kenneth R. Abbott, Sunil K. Sarin, "Experiences with Workflow Management: Issues for the Next Generation", © 1994 ACM 0-89791-689-1/94/0010, http://www.acm.org
[Acker96] Ackerman M., McDonald D., "Answer Garden 2: Merging Organizational Memory with Collaborative Help", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Adida97] Adida B., "Java, more than a revolution", IEEE INTERNET COMPUTING, http://computer.org/internet/
[Andrew94] Andrews K., Kappe F., Maurer H., Schmaranz K., "On Second Generation Hypermedia Systems", J.UCS 0(0)
[Basher96] Basher M. A., "Achieving Dynamism in Second Generation Hypermedia Systems", diploma thesis, Universiti Malaysia Sarawak
[Bentley94] Bentley R., Rodden T., Sawyer P., Sommerville I., "Architectural Support for Cooperative Multiuser Interfaces", IEEE Computer Society Press, 1993, http://computer.org/internet/
[Bosak97] Bosak J., "XML, Java, and the future of the Web", http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm
[Chambe97] Chamberlain E., Mitchell M., "Lesson 2: History Of The Net", © 1997, the Board of Trustees of the University of South Carolina, http://web.csd.sc.edu/bck2skol/fall/lesson2.html
[Dourish92] Dourish P., Bly S., "Portholes: Supporting awareness in a distributed work group", Proc. ACM Conf. On Human Factors in Computing Systems (INTERCHI’92), http://www.acm.org
[Dourish96] Dourish P., Holmes J., MacLean A., Marqvardsen P., Zbyslaw A., "Freeflow: Mediating Between Representation and Action in Workflow Systems", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Elgam96] Elgamal T., Treuhaft J., Chen F., "Securing Communications on the Intranet and over the Internet", Netscape Communications Corporation, http://www.netscape.com
[Fahlen93] Fahlen L. E., Stahl O., Brown C. G., Christer Carlsson, "A Space Based Model for User Interaction in Shared Synthetic Environments", © 1993 ACM 0-89791-575-5/93/0004/0043, http://www.acm.org
[Fielding97] Fielding R. T., Kaiser G., "The Appache HTTP Server Project", IEEE Internet Computing, July-August 1997, http://computer.org/internet/
[Fitz96] Fitzpatrick G., Kaplan S., Mansfield T., "Physical Spaces, Virtual Places and Social Worlds, A Study of work in the virtual", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Gall] Gall U., Hauck F. J., "Promodia: A Java-Based Framework for Real-Time Group Communication in the Web", http://www6.nttlabs.com/papers/PAPER100/PAPER100.html
[Greenb91] Greenberg S., "Computer-supported Cooperative Work and Groupware", London, Academic Press Ltd, 1991
[Greenb95] Greenberg S., Gutwin C., Cockburn A., "Sharing Fisheye Views in Relaxed-WYSIWIS Groupware", Proceedings of Graphics Interface
[Greenb96] Greenberg S., "Peepholes: Low Cost Awareness of One’s Community", Position Paper for the ACM CHI’96, http://www.acm.org/sigs/sigchi/chi96/proceedings/shortpap/Greenberg2/sg1txt.htm
[Greenb97] Greenberg S., Johnson B., "Studying Awareness in Contact Facilitation", Position Paper for the ACM CHI’97 Workshop on Awareness in Collaborative Systems
[Grudin] Grudin J., "Groupware and Social Dynamics: Eight Challenges For Developer", http://www.ics.uci.edu/~grudin/Papers/CACM94/cacm94.9html
[Gudiva97] Gudivada V. N., Raghavan V. V., Grosky W. I., Kasanagottu R., "Information Retrieval on the World Wide Web", 1089 - 7801/97/$10.00 ©1997 IEEE, http://computer.org/internet/
[Guha97] Guha R.V., Bray T., "Meta Content Framework Using XML", http://www.w3.org:80/TR/NOTE-MCF-XML-970606
[Gutwin95] Gutwin C., Stark G., Greenberg S., "Support for Workspace Awareness in Educational Groupware", http://www-cscl95.indiana.edu/cscl95/gutwin.html
[Gutwin96] Gutwin C., Roseman M., Greenberg S., "A Usability Study of Awareness Widgets in a Shared Workspace Groupware System", Research Report 96-585-05, Department of Computer Science, University of Calgary, http://cpsc.ucalgary.ca/grouplab/papers/
[Gutwin96b] Gutwin C., Roseman M., Greenberg S., "Workspace Awareness in Real-Time Distributed Groupware: Framework, Widgets, and Evaluation", Proceedings of the People and Computers XI HCI’96
[Gutwin97] Gutwin C., Greenberg S., "Workspace Awareness", Position Paper for the ACM CHI’97 Workshop on Awareness in Collaborative Systems
[Gutwin97b] Gutwin C., Greenberg S., "Effects of Awareness Support on Groupware Usability", Research Report 97-605-07, Department of Computer Science, University of Calgary, http://cpsc.ucalgary.ca/grouplab/papers/
[Hall96] Hall R. W., Mathur A., Jahanian F., Prakash A., Rassmussen C., "Corona: A Communication Service for Scalable, Reliable Group Collaboration System", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Hamil96] Hamilton G., "JavaBeans", © 1996, 1997 by Sun Microsystems Inc., http://java.sun.com/beans
[Hughes97] Hughes M., "JavaBeans and ActiveX go head to head", http://www.javaworld.com/javaworld/jw-03-1997/jw-03-avb-tech.html
[Hyper97] "Hyperwave User’s Handbook", © Hyperwave Information Management GmbH
[Jeff96] Han J., Smith B., "CU-SeeMe VR Immersive Desktop Teleconferencing", © 1996 ACM 0-89791-871-1/96/11, http://www.acm.org
[Jensen] Jensen P., Soparkar N., "Real-Time Concurrency Control in Groupware"
[Kappe97] Kappe F., Pani G., "Hyper-G Client-Server Protocol"
[Kindb] Kindberg T., Coulouris G., Dollimore J., Heikkinen J., "The Mushroom Project: Creating Collaborative Spaces", Department of Computer Science Queen Mary and Westfield College, University of London
[Kiniry97] Kiniry J., Zimmerman D., "A Hands-on Look at Java Mobile Agents" 1089 - 7801/97/$10.00©1997 IEEE, http://computer.org/internet/
[Koulop] Koulopoulos T. M., "The Workflow Imperative, building real world business solutions", ISBN 0-442-01975-0, ITP Inc.
[Krulw97] Krulwich B., "Automating the Internet: Agents as User Surrogates", 1089 - 7801/97/$10.00 ©1997 IEEE, http://computer.org/internet/
[Lee96] Lee A., Girgensohn A., Schlueter K., "Challanges in Deploying a Video Awareness Tool in the Workspace", Position Paper for the ACM CHI’97 Basic Research Symposium, http://www.cs.utoronto.ca/~alee/brs96.html
[Leiner97] Leiner B. M., Cerf V. G., Clark D. D., Kahn R. E., Kleinrock L., Lynch D. C., Postel J., Roberts L. G., Wolff S., "A Brief History of the Internet", http://www.isoc.org/internet-history/
[Ly97] LY E., "Distributed Java applets for project management on the Web", 1089 - 7801/97/$10.00 ©1997 IEEE, http://computer.org/internet/
[Mark96] Mark G., Haake J. M., Streitz M. A., "Hypermedia Structures and the Division of Labor in Meeting Room Collaboration", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Mathur95] Mathur A. G., Hall, R. W., Jahanian F., Prakash A., Rasmussen C., "The Publish/Subscribe Paradigm for Scalable Group Collaboration Systems", Department of Electrical Engineering and Computer Sience, University of Michigan
[Maurer96] Maurer H., "Hyperwave: the Next Generation Web Solution", © Addison Wesley Longman 1996, ISBN 0-201-40346-3
[Maurer97] Maurer H., "Multimedia Repositories and the LIBERATION Project", http://www.iicm.edu/liberation
[McGr97] McGrew T., "Collaborative Intelligence: The Internet Chess Club on Game 2 of Kasparov vs. Deep Blue", © 1997 Institute of Electrical and Electronics Engineers, Inc.
[Micro97] "XML White Paper", © 1997 Microsoft Corporation, http://www.microsoft.com/xml/xmlwhite.htm
[Mitche95] Mitchell A., "Concurrency Control in CSCW Systems", http://www.dgp.toronto.edu/~alex/unpublished/concurrency.html
[Molnar97] Molnar A. R., "Computers in Education: A Brief History", The Journal Online, http://www.thejournal.com/SPECIAL/25thani/0697feat02.html
[Munson96] Munson J., Dewan P., "A Concurrency Control Framework for Collaborative Systems", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[NCSA97] "The Common Gateway Interface", http://hoohoo.ncsa.uiuc.edu/cgi/
[NCSA97a] "The Common Gateway Interface", http://hoohoo.ncsa.uiuc.edu/cgi/primer.html
[Orchar97] Orchard D., "Java 1997: A detailed look at where Java’s going this year and in the near future", http://www.javaworld.com/javaworld/jw-06-1997/jw-06-javafuture.html
[Pam95] Pam A., Vermeer A., "A Comparison of WWW and Hyper-G", J.UCS 1(11)
[Patter1] Patterson J. F. et al, "The Notification Service Transfer Protocol (NSTP): Infrastructure for Synchronous Groupware", http://www.lotus.com/research/21ca/www6paper.html
[Patter2] Patterson J. F. et al, "Notification Servers for Synchronous Groupware", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Patter96] Patterson J. F., Day M., Kucan J., "Notification Servers for Synchronous Groupware", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Poltro95] Poltrock S., Grudin J., "Groupware and Workflow: A survey of systems and behavioral issues", © 1995 ACM 0-89791-755-3/95/0005, http://www.acm.org
[Prinz] Prinz W., Syri A., "Two complementary tools for the cooperation in a ministerial environment", German National Research Center for Information Technology, Germany
[Prinz96] Prinz W., Kolvenbach S., "Support for Workflow in a Ministerial Environment", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Renshaw] Renshaw D., "JavaBeans: The Perfect Roast", http://www.ibm.com
[Resnick] Resnick P., "Filtering Information on the Internet", http://www.sciam.com/0397issue/0397resnick.html
[Resnick96] Resnick P., Miller J., "PICS: Internet Access Controls Without Censorship", © Copyright 1996 by ACM, Inc. (note 1), http://www.w3.org:80/PICS/iacwcv2.htm
[Rosema96] Roseman M., Greenberg S., "TeamRooms: Network Places for Collaboration", © 1996 ACM 0-89791-765-0/96/11, http://www.acm.org
[Schlichter97] Schlichter J., Koch M., Buerger M., "Workspace Awareness for Distributed Teams", Proceedings of Workshop Coordination Technology for Collaborative Applications, Singapore
[Shoffn97] Shoffner M., "JavaBeans vs. ActiveX: Strategic analysis", http://www.javaworld.com/javaworld/jw-02-1997/jw-02-activex-beans.html
[Simon94] Simon R., Sclabassi R., Znati T., "Communication Control in Computer Supported Cooperative Work Systems", © 1994 ACM 0-89791-689-1/94/0010, http://www.acm.org
[Sun97] "Java White Papers", © 1995-97 Sun Microsystems Inc., http://java.sun.com/docs/white/index.html
[Whit96] Whitaker R., "Computer Supported Cooperative Work (CSCW) and GroupWare", © 1989, 1992, 1996. http://www.informatik.umu.se/~rwhit/CSCW.html
[Wulf97] Wulf V., "Konfliktmanagement bei Groupware", Vieweg, ISBN 3-528-05576-6, 1997
[Zhao96] Zhao J., Koch E., "A Digital Watermarking System for Multimedia Copyright Protection", © 1996 ACM 0-89791-871-1/96/11, http://www.acm.org
[Zuse96] "The History Of Hyper-G", © by Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB), http://elib.zib.de/hyperg/about/history