+++ to secure your transactions use the Bitcoin Mixer Service +++

 

Skip to content

Login | Register

start:

Project Sections: Home | HowTo's | Specifications | Tips 'n' Tricks | QA | Downloads | Miscellaneous | Drivers
Miscellaneous Sections: Developer Projects | OpenOffice.org Base 2.0

OpenOffice.org Base 2.0

There has been some confusion recently about the direction of OpenOffice.org Base, and in particular about OpenOffice.org Base 2.0. A lot of rumors have been spread, a lot of questions have been asked repeatedly, suggestions have been made - in short, there was a lot of discussion ongoing in various channels.
This page should serve as the definite source for information about this topic.


Table of content
Here's where we are: OpenOffice.org Base 1.x
Where are we going? OpenOffice.org Base 2.0
A document-centric application
Things that won't change
Self-contained database files
Embedding of data
Requirements
The choices
The decision
Frequently Asked Questions
Is the decision to use HSQLDB final?
Can't we negotiate the license with the HSQLDB project?
But how can we use HSQLDB if we cannot ship it?
Does OpenOffice.org Base 2.0 run without HSQLDB?
Why not SQLite?
Why not Clouldscape?
Why not Sleepy Cat / Berkely DB?
Change log of this document

Here's where we are: OpenOffice.org Base 1.x

As you know, OpenOffice.org Base 1.x enables you to access a variety of databases. This ranges from some text or dBase files on your hard disc, over for instance MySQL and PostgreSQL up to e.g. Oracle and DB2. There's even more than this, since OpenOffice.org is capable of talking to any database which provides and ODBC, JDBC or (on Windows) an ADO driver.

For all those database, OpenOffice.org offers you the usual data manipulation features: You can read and write data in various ways. Most simple, you can use table views. But you can also create your own queries, using the graphical query design. Additionally, you can create own form documents, using OpenOffice.org's powerful form designer.

Furthermore, OpenOffice.org offers data definition features: It allows you, for instance, the creation and modification of tables, or table relationships. For a limited set of databases, there's also a user administration available.

OpenOffice.org also has with - admittedly still limited - reporting capabilities, by that you can use an auto pilot to create report documents.

Accessing databases is done by creating so-called data sources. A data source describes everything needed to access a concrete database (for instance authorization information, and where the database is located). Additionally, several things are associated with a data source, such as user-defined queries, and form and report documents.
Data sources exist relative to an installation only. This means that if you want to access the same database on another machine, or even from within another OpenOffice.org installation on the same machine, you have to manually duplicate the data source. Also, you have to ensure that the data sources are in sync, for instance, if the authorization information changes, it must be adjusted in all installations.

Leaving the pure database world, OpenOffice.org Base provides the other applications - most notable Writer and Calc - with database access. You can use this, for instance, to write mail letters based on your system address book, or to incorporate your database-originated data in spreadsheet documents.

All of the aforementioned features are, in general, independent on the concrete database. When you work with whatever is your pet database, OpenOffice.org Base offers you all its features, provided that the database itself supports them (for instance, you won't be able to design relationships if your database doesn't know about relationships).

Putting aside the pure functionality, there's one thing which attracts attention immediately: The application itself (if you want to call it this way) is pretty hidden. You find nearly all of it's functionality in the Tools|Data sources... and View|Data sources menus. There is no such thing as File|New|Database, and no own dedicated application. Instead, you always need a document of another type (for instance a spreadsheet or text document) to access the funcitonality of OpenOffice.org Base.

As you can see, OpenOffice.org Base is mostly a front-end to existing databases. In fact, that's its main focus. One reason for this is that there are plenty of databases out there - open source or not, good or bad for a particular purpose, mature or not. There is no gain in creating Yet Another Database - it would be a waste of resources only.

All of this is probably not new to you, but well, we'll need it later on smile

Where are we going? OpenOffice.org Base 2.0

A document-centric application

The most notable change in OpenOffice.org 2.0 will be that Base is an own application. First, this implies own document files. So you now have a File|New|Database menu entry, which creates a new "database document", and opens it in a dedicated application window. You can work with this document, and then save to an arbitrary place - just like you would do with a text document.

Side note: If you're interested how this looks like, we encourage you to have a look at the latest developer snapshot. From milestone m50 onwards, the new application is contained. And though there are still some edges (and probably also bugs), you can already use it to get an impression, and give us feedback.

This change will solve quite some problems which users had with OpenOffice.org 1.x. For instance, database documents are not per-installation anymore, like in OpenOffice.org 1.x. Instead, you can easily use the same database document from within different installations. Changes done in one installation are stored in the file, and thus propagated to the other installations transparently.

Note that those database documents will contain everything related to the data source, as far as solely OpenOffice.org is concerned. In particular the following is stored in the documents:

  • Query definitions
  • Form documents
  • Report documents
  • UI informations (for instance the formatting you gave a certain table column)

Things that won't change

With the above said, you may notice that a lot of things won't change in OpenOffice.org Base 2.0. You still can use it as front-end for your pet database. It will collaborate as good or as bad - hopefully the former - with it as OpenOffice.org 1.x did. In fact, we also improved on those collaborations a bit, since this is a permanent process.

In particular, there won't be any need to buy or install foreign products to benefit of OpenOffice.org Base 2.0. Out of the box 2.0 will offer you the very same functionality as 1.x does. If you want, and as a very rough approximation, just think of 2.0 as a new user interface to the same functionality.

That's good news, isn't it?

Self-contained database files

You probably noted that the actual data does not appear in the above list of things contained in the database files. This means that you still don't have a complete file - the "database document" only refers to your real database.

In some cases, this does make sense, for instance when we're talking about a MySQL/PostgreSQL/Oracle database residing on a remote server. Sometimes, however - for instance when your database basically is some dBase files -, users wish to simply have a single file, representing the database as a whole - including the data!

This would offer interesting use cases: You could take your database file, and easily give it to a friend, just as you could with a text document. Your friend could open it, and see all the data, plus the reports, forms, and queries. Making changes/additions, s/he should then send it back to you (by just copying the file), and you could continue working with it.

Doesn't this sound like a wanna-have for an office suite? We think it does smile.

For this reasons, we are currently aiming to provide OpenOffice.org Base 2.0 with the possibility to have self-contained database files. That is, additionally to all the related documents, the actual data should also be part of our database documents.

Note that this embedding would be optional only: You can still have database documents without integration of the data, if you like!

One important decision we made here is that we still won't create an own database engine. A good database engine is a complex piece of software, and as said above, there are enough of them out there, and even enough which are open source. So why bothering with the re-invention of the wheel, if there are enough wheels asking to be used?

Thus, we are looking for such a foreign database, which can be integrated into OpenOffice.org Base, to reach our goal of "complete" database documents. The chapter Embedding of data goes into detail about this.

Embedding of data

Requirements

When chosing a foreign database to integrate with OpenOffice.org, there are several things we need to take into account:

  • File based

    We don't need a client-server database here. If you open your small innocent database file, you do not want to start a local database server for this, do you? Even if it were completely hidden from you by OpenOffice.org Base, it is way too much hassle for various reasons.

    Thus, what we are looking for is a database which is able to store its data in one or more files on your local hard disc, with leaving you the complete control over this file.

  • Open Source

    Of course we won't use a database which is closed, even if it were for free. There's no argueing about this, is it?

    Well, the main technical reason is the following: Remember that we want to store the data in our files. It's of no use if OpenOffice.org creates its database documents, and the foreign database creates own files with the actual data. That's still at least one file too much.

    So, we need to modify the foreign database to (alternatively) store it's data somewhere else - in a place OpenOffice.org does tell it. This, of course, can be done with access to the source code only.

  • Contributing back

    If the database we're looking for is a living project, then there will be new versions at some time. Of course, we want OpenOffice.org users to benefit from those. This implies that the aforementioned modifications we do to it must be contributed back to the project. Note that we consider this essential: It will allow us to simply point the users to a place where they can download the latest versions, and gain all the benfits it has.

  • The time frame

    We aren't too far away from a beta version of OpenOffice.org 2.0, as you can see in the roadmap. As the policy is to not put in larger features too late (to ensure that the final version is as good and stable as possible), we have a limited time frame only. This means that we must be able to finish all modifications in a not too distant future, which imposes some restrictions on the database we can chose, depending on the effort we would have with its integration.

  • Features

    The database to chose should have a certain maturity and feature set, to offer a real value to our users. It doesn't make sense to provide the embedding feature if you can't use it because it doesn't meet your needs.

  • Platform availability

    OpenOffice.org runs on 4 primary platforms (Windows, Linux x86, Solaris x86, Solaris Sparc), and a huge number of (no offense intended) secondary platforms. Any database we embed should offer support for as much of those as possible.

The Choices

There are various database to chose from. The following lists the most often mentioned:

  • HSQLDB

    HSQLDB is fully java-based. It has a pretty rich feature-set, including, for instance, referential integrity. Currently OpenOffice.org seems to have performance problems with certain types of HSQLDB access, which must be resolved before releasing 2.0.

    An advantage of HSQLDB is that it comes, of course, with a JDBC driver, and OpenOffice.org already allows accessing arbitrary databases via JDBC.

    Update (2004-08-31):
    'till end of august '04, the license of HSQLDB featured a sentence that effectively would have prevented OpenOffice.org from shipping with HSQLDB. Luckily, and due to the courtesy of Thomas Mueler, who held the original copyright for parts of the HSQLDB sources, this is not the case anymore. The offending statement (which was "All advertising materials mentioning features or use of this software must display the following acknowledgment: 'This product includes Hypersonic SQL.'") has been removed from the license. Thus, chances are good that we can package HSQLDB together with the OpenOffice.org download set. (Though this still needs an explicit legal review.)

  • SQLite

    SQLite is a powerful, low-overhead database engine written in C. An advantage is it's very generous license. An disadvantage is that there's only a third-party ODBC driver available, which doesn't run on too much of OpenOffice.org's platforms. Without it, accessing SQLite would require a dedicated SQLite SDBC driver.

    Update (2004-08-31):
    In the meantime, there's an alpha version of this driver available. If you like nothing more than bleeding edge technology, try it out, and give us feedback.

  • Cloudscape

    Recently, Cloudscape has been brought into discussion, as IBM donated it to the Apache Software Foundation. Cloudscape also is a 100 percent Java database. However, it currently is not really available, yet (Apache is still evaluating it before the initial public release).

The decision

We decided to go with HSQLDB. That means that for OpenOffice.org 2.0, we will implement a solution which, when HSQLDB is installed, will allow the user to create a OpenOffice.org database document, internally containing a HSQLDB database.

There are drawbacks of this decision, which we name below. However, in our opinion it's the only possible solution, considering all our requirements. See below for more discussion on this.

The clear disadvantages are:

  • HSQLDB requires Java. We are aware of the fact that some OpenOffice.org distributions do not include Java. Even some of the platforms which OpenOffice.org is ported to do not have a recent Java Virtual Machine at all. In those cases, the "embedding feature" will not be available to the user.

On the medium run, those disadvantages may vanish:

  • HSQLDB currently runs fine with Kaffe, and Kaffe can be found in much more Linux distributions than Sun's Java Runtime Environment.

The advantage is that it finally will allow us to provide complete database documents in time. Which is the difference to all other alternatives we looked at ...

Frequently Asked Questions

Is the decision to use HSQLDB final?

It depends. We are currently investing our time into HSQLDB, and the work needed for its integration. If somebody can provide additional resources for another solution, this is appreciated. And certainly, you can bargain with our full support in this case. However, at the moment we want to concentrate on doing something, instead of losing more time with figuring out what to do.

Can't we negotiate the license with the HSQLDB project?

Unfortunately, there's not much room for this. Only parts of the HSQLDB sources bear the unfortunate license, but those parts have been written by authors who do not work on the project anymore. In fact, this code has been inherited from the time when HSQLDB was not an open source project.

The HSQLDB people offered to re-write the code in question, so a newer (and better suitable) license would apply to the whole of the code. Unfortunately, this requires significant work, which they cannot do in their spare time (at least not in the time frame we're talking about). They offered to do this work if we could provide funding for this - which we, unfortunately, are unable to do.

Update (2004-08-31):
This is pointless now. The license of HSQLDB does not feature the single statement which would have prohibited shipping it with OpenOffice.org, anymore.

But how can we use HSQLDB if we cannot ship it?

We can provide a place to download the modified HSQLDB package. As soon as our modifications have made it into the official HSQLDB release, this download place would be the project site itself.

So, we can point people to the download location. In the simplest case, all the have to do is to download a JAR file, and place it in some folder which OpenOffice.org knows about.

Update (2004-08-31):
This is pointless now. The license of HSQLDB does not feature the single statement which would have prohibited shipping it with OpenOffice.org, anymore.

Does OpenOffice.org Base 2.0 run without HSQLDB?

Of course. Everything known from OpenOffice.org 1.x will work as before, it will just look a little bit different. The only thing you'll miss is the feature of embedding the database data into your database documents.

Why not SQLite?

Because we consider the effort for integrating SQLite as too high at the moment. SQLite would probably need a dedicated SDBC driver, which can be quite some work.

Update (2004-08-31):
In the meantime, there's an alpha version of such a driver available.

If you want to work on the SQLite integration - great! Speak up at our developer mailing list, and get every needed support!

You even could do this after OpenOffice.org 2.0 is out. We will provide a infrastructure for embedding data into our database documents, which can be used by arbitrary drivers. So if you write an "Embed SQLite"-driver for OpenOffice.org, you could deploy it in an existing OpenOffice.org 2.0 installation, and it will automatically be recognized. Upon creating a new database document, the user then would have the choice between all database types for which an "embedding" driver is available.

Why not Clouldscape?

Cloudscape is too new. It's even not really released, yet. Instead, it has just recently been donated to the Apache Software Foundation by IBM. So there's a big uncertainity about it at the moment. For instance, we cannot really judge the effort it would take to make a Cloudscape driver embed its data into our database documents, without close examination of the source.

Why not Sleepy Cat / Berkely DB?

Berkely DB is no relational database. It's a low-level backend, which could be used to implement a relational database, but it's not really a database itself. For instance, it does not have a query engine, a query optimizer, and stuff like that which is so difficult about a database engine. Thus, using it would be nearly as expensive and difficult as it would be to write a database from scratch.



Change log of this document

$Log: dba20.html,v $
Revision 1.23  2006/07/12 08:46:52  msc
remove SDBTesttool entry

Revision 1.22  2006/07/10 12:53:04  clu
added qa link

Revision 1.21  2006/06/22 08:19:22  fs
no nifty corners anymore - this looks ugly on some browsers

Revision 1.20  2006/04/25 10:56:09  fs
some common behaviour, implemented via JS

Revision 1.19  2005/02/28 13:39:42  fs
new section 'drivers'

Revision 1.18  2005/02/28 13:35:34  fs
new section 'drivers'

Revision 1.17  2004/08/31 13:11:06  fs
new HSQLDB license - alpha version of an SQLite driver

Revision 1.12 - 1.14 layout
link to alpha version of an SQLite driver

Revision 1.11  2004/08/12 08:46:54  fs
some relaxation of the HSQLDB disadvantages - thanks to Fred Toussi, maintainer of the HSQLDB project

Revision 1.10  2004/08/12 08:25:54  fs
added change log

      

Author: Frank Schönheit.
Version: $Revision: 1.23 $
Last change: $Date: 2006/07/12 08:46:52 $