Zaurus/Mac

zXSync - Design and Architecture

Andreas Junghans

Geoff Beier


Table of Contents

1. Introduction
1.1. Overview
1.2. Goals
1.3. Technical realities
1.4. Problem cases
2. Architecture
2.1. Core synchronization engine
2.2. Plug-in architecture
2.3. Command line interface
2.4. GUI interface

1. Introduction

1.1. Overview

The main purpose of zXSync is to provide synchronization capabilities between Sharp Zaurus PDAs and Mac OS X. This is also where the name comes from ("z" for "Zaurus" and "X" for "Mac OS X"). Synchronization means that contacts, calendar entries, and other information from the area of "Personal Information Management" (PIM) can be transferred between the standard applications on the Zaurus and the Mac that are responsible for handling this data. On the Zaurus, these are the built-in Address Book, Todo List, and Calendar applications. On the Mac, these are the built-in Address Book and iCal.

1.2. Goals

The following goals are the driving forces for zXSync's design:

Easy handling

The user should have to intervene as seldom as possible. However, the user should be able to control the synchronization in as much detail as he or she wants.

Reliability

The user should always be made clearly aware before destructive changes are made by zXSync. Destructive means the change will lead to loss of data the user once entered, no matter how small. In addition, regular backups should be made automatically to provide an easy way back when something goes wrong.

Extensibility

zXSync should be easily extensible by the original authors or third parties. The most important form of extension is support for additional devices and/or applications.

Flexibility

There should be several ways to interact with zXSync and use it from different environments. A command line tool is necessary for automation of synchronizations and for advanced features not easily made available within a GUI. For standard use, a modern, easy to understand GUI should be provided that requires only a minimum of user interaction but on the other hand allows detailed customization. Finally, the core functionality should be available in a library so it can easily be used by other applications.

Transparency

The user should always be aware of what zXSync is doing at any time and what it will do if the user issues a certain command. Detailed reports of every sync should be available to show exactly which entries have changed in what way.

Support for open standards

zXSync should support open standards where possible and sensible. In particular, vCard, iCalendar, and SyncML should be supported. However, since SyncML is not part of the standard software installed on the Zaurus, it is not a top priority. It is sufficient if SyncML support is available through an extension after basic synchronization with the Zaurus works.

Portability

The core functionality should be available in a library that is easily portable to various systems. However, it should also be lightweight and easy to integrate into applications, so Java is not an option here.

1.3. Technical realities

In an ideal world, every device/application would work in a way that makes it easy to track user changes to data. This way, only the changes would have to be sent when syncing the device and a host computer. Unfortunately, there are a lot of devices that don't have this capability - they are "dumb" when it comes to handling their data. zXSync must be able to deal with this and must not assume that changes can be tracked. In other words, it must be capable of dealing with "passive" devices that offer their data like a hard disk with a non-journaling file system.

Examples of "dumb" devices are the Zaurus and Apple's iPod. While the Zaurus can be upgraded with software to maintain a change log (which is also required by the more advanced SyncML modes), the iPod cannot. zXSync should work in a way that makes it possible to load the iPod with contacts on any Mac, with no special software installed, and let it later participate in a synchronization.
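One way to cope with a passive device is to read all of its data and compare it with the state seen at the previous sync. The following is only a sketch under the assumption that the host keeps such a snapshot; neither the snapshot nor the function name diff_snapshots is part of the design described here.

```python
def diff_snapshots(previous, current):
    """Compare two full reads of a passive device.

    Both arguments map an entry id to a dict of field values
    (a simplified, hypothetical representation).
    Returns three sorted lists of ids: added, deleted, changed.
    """
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    changed = sorted(
        entry_id
        for entry_id in current.keys() & previous.keys()
        if current[entry_id] != previous[entry_id]
    )
    return added, deleted, changed


# State remembered from the last sync (assumption: stored on the host).
last_sync = {
    "1": {"Name": "Doe", "Phone": "555-0100"},
    "2": {"Name": "Smith"},
}
# State read from the device now.
now = {
    "1": {"Name": "Doe", "Phone": "555-0199"},
    "3": {"Name": "Miller"},
}

added, deleted, changed = diff_snapshots(last_sync, now)
```

The device itself needs no change log for this to work, which is the property required for something like the iPod.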

There's also something to be said about Apple's iSync. There are a lot of reasons for developing zXSync instead of using iSync. The most important ones are mentioned here to make the decision for starting from scratch clear:

  1. There is no SDK or public plug-in API for iSync. It's not clear if iSync is extensible at all without recompiling it.

  2. iSync seems to be based on SyncML, which would require additional software on the Zaurus. While this is no absolute obstacle, it's better if you can sync with it "out of the box".

  3. iSync is quite inflexible (for example, it can only sync to an iPod, not from it), and there are no detailed choices of what should be synced and what not.

  4. iSync is not very transparent. It's not easily possible to see exactly what it did during a sync.

1.4. Problem cases

The following is a collection of problem cases a sync application has to handle.

  • The user entered a phone number for Jane Doe on the Zaurus and an additional eMail address for her on the Mac. What should happen if you sync the two is that Jane's phone number is added on the Mac and her eMail is added on the Zaurus. Thus, syncing must happen on the field level rather than the address (or, more generally, entry) level.

  • The user added and deleted various entries on the Zaurus and the Mac. Since not all devices have something like a last-modified date (the Zaurus doesn't), the user has to decide individually which entries should be copied/deleted. Thus, it must be controllable on the entry or even field level whether some information should be read or written (or both or neither).

  • The user synchronizes between various devices, and some of them have fewer fields than others (for example, some might support 2 telephone numbers while others support an arbitrary number). Syncing from devices with fewer fields should (at least by default) not destroy additional fields on other devices.
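The first and third case can be illustrated with a small sketch of a field-level merge. It assumes that an entry is a list of (field name, value) pairs; this representation, the function name merge_fields, and the sample values are made up for illustration.

```python
def merge_fields(*entries):
    """Return the union of all (field name, value) pairs of the given entries.

    Each entry is a list of (name, value) tuples, so the same field name
    may appear several times. Nothing is removed or overwritten, which
    means that a device with fewer fields cannot destroy additional
    fields coming from another device.
    """
    merged = []
    seen = set()
    for entry in entries:
        for name, value in entry:
            if (name, value) not in seen:
                seen.add((name, value))
                merged.append((name, value))
    return merged


zaurus_jane = [("Name", "Doe"), ("Firstname", "Jane"), ("Phone", "555-0100")]
mac_jane = [("Name", "Doe"), ("Firstname", "Jane"), ("eMail", "jane@example.org")]

merged = merge_fields(zaurus_jane, mac_jane)
```

After the merge, Jane's entry carries both the phone number from the Zaurus and the eMail address from the Mac. Deciding what to do when two devices hold different values for the same field is a separate question that this sketch does not try to answer.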

2. Architecture

To fulfill the aforementioned goals, the following components are suggested:

  • At the core, a synchronization engine that handles entries in some representation of a standard format (vCard for contacts, iCalendar for calendar entries and todo items). The sync engine must be capable of reading entries from various devices, checking them for additions, deletions, and changes, resolving conflicts, and writing changed and added entries back to the devices. Note that it should be possible to synchronize the Mac with more than one device at a time.

  • A plug-in system for device/application support. It should be easy to add the ability to read from and write to a new device.

  • A command line front-end.

  • A Cocoa GUI (deliberately non-portable to leverage the advantages of the Mac OS X operating system).

2.1. Core synchronization engine

2.1.1. Sync hierarchy

As already mentioned, the user should be able to control in great detail what should be done when syncing. However, general choices must be possible on a broad level to avoid micro-management. For this purpose, a hierarchy structures the information that can be part of a sync. Its levels are as follows:

Device

This is the root of the hierarchy. The host computer is also considered a device.

Application

An application basically groups all entries of a kind. The term is used in an abstract way here. For example "contacts" would be an application in zXSync terms rather than "Address Book" or "ABCSoft Address Manager X".

Entry

An entry is a data record. For example, a contact containing name, address, etc. would be an entry as well as an appointment in the calendar.

Field

A field is part of an entry. For example, a contact can contain a field called "eMail". Fields cannot nest, but the same field can appear several times with different values. [1]
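As a rough illustration, the four levels could be modelled with plain Python classes. This is only a sketch: the class layout and the helper method values are assumptions, not the actual data model of the engine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """A single named value. Fields do not nest."""
    name: str
    value: str


@dataclass
class Entry:
    """A data record, e.g. one contact or one appointment."""
    fields: List[Field] = field(default_factory=list)

    def values(self, name):
        """Return all values of the fields called `name` (may be several)."""
        return [f.value for f in self.fields if f.name == name]


@dataclass
class Application:
    """Groups all entries of one kind, e.g. "contacts"."""
    name: str
    entries: List[Entry] = field(default_factory=list)


@dataclass
class Device:
    """Root of the hierarchy. The host computer is a device, too."""
    name: str
    applications: List[Application] = field(default_factory=list)


jane = Entry([
    Field("Name", "Doe"),
    Field("eMail", "jane@example.org"),
    Field("eMail", "jd@example.org"),  # same field, second value
])
mac = Device("mac", [Application("contacts", [jane])])
```

Keeping the fields of an entry in a list rather than a dictionary is what allows the same field to appear several times.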

2.1.2. Sync modes

To provide the user with fine control, it should be possible to specify on all hierarchy levels what should happen during a sync. This is specified by one of four sync modes explained below. The sync mode of one element in the hierarchy determines the sync mode of all its children, but this can be overridden. If one child overrides the sync mode of its parent, this new mode in turn is used for the child's children - provided these do not override it again.

The four sync modes are:

Sync none

An element with this sync mode does not participate in a synchronization. It is not propagated to other devices, and it is not changed or deleted.

Sync from

An element with this sync mode is propagated to other devices, but not changed or deleted, even if there is different data for this element on other devices.

Sync to

An element with this sync mode is not propagated to other devices, but it is changed (if altered on another device) or deleted (if removed on another device). Note that "on another device" only refers to elements on these devices that are in "sync from" or "sync both" mode. For example, if an element is changed on device A and marked as "sync none", these changes are guaranteed to not be applied to other devices, no matter what their sync mode is.

Sync both

An element with this sync mode is propagated to other devices. It is changed if it was changed on other devices, but only if this change doesn't destroy any information. For example, adding a field is not considered destructive, but changing a field's value is! Thus, a "sync both" element behaves just like a "sync from" element when being read, but it receives more protection than a "sync to" element when being written. [2]
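The inheritance of sync modes through the hierarchy and the differences between the four modes can be sketched as follows. The names Node, effective_mode, and the three helper functions are hypothetical, and falling back to "sync both" when nothing is set is an assumption based on the default suggested in this document.

```python
from enum import Enum


class SyncMode(Enum):
    NONE = "none"
    FROM = "from"
    TO = "to"
    BOTH = "both"


class Node:
    """Any element of the hierarchy (device, application, entry, field)."""

    def __init__(self, name, parent=None, override=None):
        self.name = name
        self.parent = parent
        self.override = override  # None means "inherit from parent"

    def effective_mode(self):
        """Walk up the hierarchy until an explicitly set mode is found."""
        node = self
        while node is not None:
            if node.override is not None:
                return node.override
            node = node.parent
        return SyncMode.BOTH  # assumed default if nothing is set


def is_propagated(mode):
    """Is the element's data offered to other devices?"""
    return mode in (SyncMode.FROM, SyncMode.BOTH)


def accepts_additions(mode):
    """May non-destructive changes (e.g. an added field) be written?"""
    return mode in (SyncMode.TO, SyncMode.BOTH)


def accepts_destructive_changes(mode):
    """May values be overwritten or the element be deleted?"""
    return mode is SyncMode.TO


zaurus = Node("zaurus", override=SyncMode.TO)
contacts = Node("contacts", parent=zaurus)                        # inherits "to"
jane = Node("Jane Doe", parent=contacts, override=SyncMode.NONE)  # overrides
email = Node("eMail", parent=jane)                                # inherits "none"
mac = Node("mac")                                                 # nothing set
```

Here the contacts application inherits "sync to" from the device, while Jane's entry and all of its fields are excluded from the sync.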

"Sync both" should be the default since it distributes all information to all devices without being destructive. In most cases, using "sync both" for everything should result in a conflict free "one-click sync".

2.1.3. Id mapping

The sync engine must be able to map ids assigned to the same entries on different devices. This is necessary since many devices use their own id scheme that is not compatible with the ones on other devices. For this reason, the engine must match entries solely based on their data fields on the first sync. However, it can cache the ids of matching entries so this has to be done only once. Note that there must also be a way to change this mapping (or to specify matching entries even before the first sync) since automatic matching may result in mistakes.
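A minimal sketch of such a mapping is shown below. It assumes that entries are matched on the fields "Name" and "Firstname", compared case-insensitively, and that only unambiguous matches are accepted; these rules, the function names, and the sample ids are illustrative choices, not decisions made by this document.

```python
def match_key(fields, key_fields=("Name", "Firstname")):
    """Build a comparison key from the data fields of one entry."""
    return tuple(fields.get(name, "").strip().lower() for name in key_fields)


def index_by_key(entries, skip):
    """Group the ids of all entries not in `skip` by their match key."""
    index = {}
    for entry_id, fields in entries.items():
        if entry_id not in skip:
            index.setdefault(match_key(fields), []).append(entry_id)
    return index


def build_id_map(entries_a, entries_b, cache=None):
    """Map ids of device A to ids of device B.

    `cache` holds mappings from earlier syncs or manual mappings; these
    are kept as they are. The remaining entries are matched by their
    fields, but only if exactly one candidate exists on each side.
    """
    id_map = dict(cache or {})
    index_a = index_by_key(entries_a, skip=set(id_map))
    index_b = index_by_key(entries_b, skip=set(id_map.values()))
    for key, ids_a in index_a.items():
        ids_b = index_b.get(key, [])
        if len(ids_a) == 1 and len(ids_b) == 1:
            id_map[ids_a[0]] = ids_b[0]
    return id_map


zaurus = {
    "z17": {"Name": "Doe", "Firstname": "Jane"},
    "z18": {"Name": "Smith", "Firstname": "Bob"},
}
mac = {
    "m-a1": {"Name": "doe", "Firstname": "Jane", "eMail": "jane@example.org"},
    "m-b2": {"Name": "Smith", "Firstname": "Robert"},
}

first_sync = build_id_map(zaurus, mac)
with_manual_fix = build_id_map(zaurus, mac, cache={"z18": "m-b2"})
```

On the first sync only Jane Doe is matched automatically. "Bob" and "Robert" Smith are left unmatched, which is the kind of case where a manual mapping is needed.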

2.1.4. Conflict handling

The engine must report conflicts in a way that lets frontends display exactly where the problems are. Conflicts are resolved by overriding the sync mode of conflicting entries. For example, by setting the sync mode to "sync to", an entry can be declared as the "loser" of a conflict. Conflicts can be resolved on every level (e. g. by setting a whole device to "sync to" mode or only a single conflicting entry).
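What a conflict report and its resolution might look like is sketched below. The sketch assumes that a conflict is a field for which two devices hold different values, and it treats every field as single-valued for simplicity; the Conflict class and the functions find_conflicts and resolve are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Conflict:
    """Describes one conflicting field so a frontend can display it."""
    entry_uid: str
    field_name: str
    values: Dict[str, str]  # device name -> value found on that device


def find_conflicts(entry_uid, per_device):
    """Report fields that differ between devices.

    `per_device` maps a device name to the fields ({name: value}) of the
    same logical entry on that device.
    """
    conflicts = []
    names = sorted({n for fields in per_device.values() for n in fields})
    for name in names:
        values = {
            device: fields[name]
            for device, fields in per_device.items()
            if name in fields
        }
        if len(set(values.values())) > 1:
            conflicts.append(Conflict(entry_uid, name, values))
    return conflicts


def resolve(per_device, conflict, loser):
    """Resolve a conflict by treating `loser` as "sync to".

    The losing device receives the value of the other device(s).
    """
    winning = {v for device, v in conflict.values.items() if device != loser}
    if len(winning) != 1:
        raise ValueError("the remaining devices still disagree")
    per_device[loser][conflict.field_name] = winning.pop()


jane = {
    "zaurus": {"Name": "Doe", "Phone": "555-0100"},
    "mac": {"Name": "Doe", "Phone": "555-0199", "eMail": "jane@example.org"},
}

conflicts = find_conflicts("123", jane)
resolve(jane, conflicts[0], loser="zaurus")
```

The missing eMail field on the Zaurus is not reported, since adding a field is not destructive. Only the differing phone number is a conflict, and the Zaurus is declared its "loser".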

2.2. Plug-in architecture

Plug-ins are simple Python modules that provide a set of functions to communicate with the core. Besides functions, objects of well-defined classes (e. g. "Device") are used for communication. Later on, other languages/platforms for writing plug-ins might be supported.

The plug-in API is described in detail in a separate document.

2.3. Command line interface

Invoking a command line tool for syncing may look like the following:

% zxsync --plugin zaurus --plugin mac --app zaurus:contacts:none --app mac:contacts:both --uid 'zaurus:contacts[Name: Doe; Firstname: Jane]:123' --uid 'mac:contacts[Name: Doe]:123' --override 'zaurus:contacts[uid:123]:to'

The --plugin switch makes a certain plug-in part of the sync. The --app switch specifies which exact applications should participate in what sync mode. --uid allows for manual id mapping. The --override switch is for overriding sync modes on arbitrary levels.
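The arguments of --app, --uid, and --override in the example share the shape device:application[selector]:value, where the selector in brackets is optional. A possible parser for this shape is sketched below; the grammar is inferred from the single example above, and the names Spec and parse_spec are hypothetical.

```python
import re
from typing import Dict, NamedTuple

# device:application[selector]:value, with the [selector] part optional
SPEC_RE = re.compile(
    r"^(?P<device>[^:\[\]]+)"
    r":(?P<app>[^:\[\]]+)"
    r"(?:\[(?P<selector>[^\]]*)\])?"
    r":(?P<value>[^:\[\]]+)$"
)


class Spec(NamedTuple):
    device: str
    app: str
    selector: Dict[str, str]  # e.g. {"Name": "Doe"} or {"uid": "123"}
    value: str                # a sync mode or a uid, depending on the switch


def parse_spec(text):
    """Split one command line argument into its parts."""
    match = SPEC_RE.match(text)
    if match is None:
        raise ValueError(f"malformed argument: {text!r}")
    selector = {}
    for part in (match.group("selector") or "").split(";"):
        if not part.strip():
            continue
        key, sep, val = part.partition(":")
        if not sep:
            raise ValueError(f"malformed selector: {part!r}")
        selector[key.strip()] = val.strip()
    return Spec(match.group("device"), match.group("app"), selector,
                match.group("value"))


app = parse_spec("zaurus:contacts:none")
uid = parse_spec("zaurus:contacts[Name: Doe; Firstname: Jane]:123")
override = parse_spec("zaurus:contacts[uid:123]:to")
```

The same parser serves all three switches; whether the last part is a sync mode or a uid depends on the switch it was given to.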

2.4. GUI interface

The GUI should make it easy to specify the sync modes on all levels and to resolve conflicts. Here is a simple screenshot to show what the GUI might look like (very early draft):

The popup menu buttons can be used to change the sync mode. If it is changed for one element, all children below it are changed accordingly (provided they had the same mode as the parent before). A marker beside an element should specify if any of its children override the sync mode of the parent (so you can see this even if the elements are not expanded).

In case of conflicts, the conflicting entries should be expanded (i. e. visible in the outline views) and marked in yellow. The color red is reserved for entries that have a mode of "sync to" to indicate that this is a possibly destructive operation. [3]

After pressing "Sync!", all participating elements are checked for conflicts. If no conflicts occur, the sync operation is completed normally. Otherwise the user is prompted to resolve the conflicts, and the "Cancel" button is activated. Resolving the conflicts and pressing "Sync!" a second time continues with syncing while pressing "Cancel" aborts the current sync.



[1] I've removed the possibility for fields to nest which is in line with the vCard standard. It's also easier to handle. If necessary, we can add nesting back later.

[2] Maybe this mode should be called "Sync auto".

[3] Not shown in the screenshot since it is not yet implemented in the prototype. Note that this color scheme is just an idea so far - there may be better ways to express the same thing.