XML.org
Newsletter, Volume 1, Issue 3
By Moshe Shadmon and Neal Sample - RightOrder, Inc.
Who Needs Directories?
Directories are the hub around which virtually all middleware services spin. They have the important task of storing and delivering critical information to people, processes, resources and groups. Having this information in a common storage area means that various distributed users and applications can access a consistent and comprehensive source for critical data. Directories are somewhat different from general databases. Directories are optimized for reads, rather than transactions. They frequently contain institutional and personal information for use by myriad applications. Directories will be among the most critical services offered in future information technology environments.
When should you choose a directory implementation over a full-fledged relational database (RDBMS)? Directories are usually the right choice when confronted with hierarchical information such as Human Resources systems, UDDI, pervasive computing, product catalogs, etc. There are still times when an RDBMS is the right choice. A comparison of the features of each shows their respective strengths.
Relational Databases | Directories
Strongly typed and structured | Strongly typed and structured
Objects have a complex relationship to each other | Objects are nested in hierarchies
Read/write transaction performance is critical | Directory entries are “read mostly”
The database is generally centralized - expensive to distribute + query/update | Can be highly distributed - reasonable cost of distribution and replication
Schema is completely user defined for flexibility | Fixed “core schema” controls directory hierarchy (e.g., country/organization/people). Schema for individual objects is highly extensible
Can deal with complex relationships between objects | Representing non-hierarchical relationships is expensive
Good for data analysis and report generation | Good for top-down searches of logical hierarchies
Relationships are known to the query processor | Relationships can be explored in the query processing
These features indicate that some applications are not suitable for directories, especially when there is a need for information linking. Examples of these applications are Enterprise Resource Planning (ERP) and accounting systems.
The Lightweight Directory Access Protocol (LDAP) is currently integrated into many products, from mail directories to public-key infrastructures to network components. And much more is coming.
Applications that integrate directories have been successful for many reasons. One reason is that a core schema enables a common access protocol, so many applications can leverage the same data source. Clients can have basic directory knowledge “built in.”
However, LDAP as a central component of directories is somewhat limiting. It’s supposed to be an access protocol for directory data, but it demands that input and output data conform to a fairly strict construction. Directories are limited to some degree by the prosaic protocol used to access the underlying information. What is missing from directories is the fundamentally enabling nature of a self-describing system.
The combination of directory services and XML is the final step in creating directory-enabled applications within web service architectures. XML is a self-describing language for data of any type. But XML implies much more than just the ability to wrap data into a convenient bundle. XML technologies have been developed to deal with ragged and incomplete data in a robust manner. For instance, validating parsers can make guarantees to the application about the elements in a document. Is the question of validity relevant to LDAP? Perhaps, but peripherally at best.
Querying and using directory data packaged as XML presents new opportunities to applications. No longer do applications have to rely on even the “core schema” of a directory to be effective. Directory schemas may change, but XML-enabled applications are relatively immune to the effects. Also, applications written for one directory can be used with another directory by using common tools (such as XSLT) to bridge the gap. These approaches are natural to XML applications, but similar ideas are not part of LDAP.
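To make the idea of bridging two directories concrete, here is a minimal sketch. It is not XSLT: it uses Python's standard xml.etree.ElementTree module as a plain stand-in for what a stylesheet would do, renaming elements from one directory's vocabulary to another's. The element names and the TAG_MAP table are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from directory A's element names to directory B's.
# A real bridge would more likely be expressed as an XSLT stylesheet.
TAG_MAP = {
    "staff": "people",
    "member": "person",
    "fullName": "cn",
    "email": "mail",
}


def bridge(xml_text, tag_map=TAG_MAP):
    """Rename elements according to tag_map; leave unknown elements untouched."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # iter() visits the root and all descendants
        elem.tag = tag_map.get(elem.tag, elem.tag)
    return root


# Illustrative input in directory A's vocabulary, including one element
# ("pager") that the mapping does not know about.
SOURCE = (
    "<staff><member>"
    "<fullName>Alice</fullName>"
    "<email>alice@example.com</email>"
    "<pager>555</pager>"
    "</member></staff>"
)

converted = bridge(SOURCE)
print(ET.tostring(converted, encoding="unicode"))
```

Elements the mapping does not mention pass through unchanged, so an application written against the second vocabulary keeps working when the first directory adds fields.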
First and foremost, directory technology needs to efficiently support the hierarchical nature of the data. There should also be support for schema extensibility and evolution. Directories should also be scalable. None of these concerns (save perhaps scalability) is central to LDAP. Directory services have been driven by the access protocol for those directories, putting the proverbial cart before the horse.
With a more flexible framework around the directory and driving directory development, the channel is clear for robust directory applications. For example, XPath can be used to address parts of an XML document and provides the mechanism needed to support hierarchical directory paths.
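As a rough illustration of XPath-style addressing over a directory hierarchy, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The document layout (country/organization/person) follows the core-schema example mentioned earlier; the element names, attribute names, and sample entries are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory document; names are illustrative and do not
# come from any particular directory product.
DIRECTORY_XML = """
<directory>
  <country name="US">
    <organization name="Example Corp">
      <person name="Alice"/>
      <person name="Bob"/>
    </organization>
    <organization name="Sample Org">
      <person name="Carol"/>
    </organization>
  </country>
</directory>
"""

root = ET.fromstring(DIRECTORY_XML)


def people_in(country, organization):
    """Return person names found under one country/organization path."""
    path = (
        f"./country[@name='{country}']"
        f"/organization[@name='{organization}']/person"
    )
    return [p.get("name") for p in root.findall(path)]


def all_people():
    """Top-down search: every person anywhere in the hierarchy."""
    return [p.get("name") for p in root.findall(".//person")]


print(people_in("US", "Example Corp"))
print(all_people())
```

The path expression plays the same role a distinguished name plays in LDAP: it names a position in the hierarchy, and the same syntax also expresses searches across the whole tree.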
Why XML Directories?
Directory Services Markup Language (DSML) is a markup language for representing directory services in XML. DSML is a key enabler for the next stage of flexible and robust directory applications. DSML is being established as an open standard, so that developers and vendors will be able to adopt it into their systems. Two questions remain for DSML:
- Are there clear reasons DSML is superior to LDAP for directories?
- Even if DSML is a superior option, is LDAP too entrenched?
By now, the answer to the first question should be clear. DSML, as a flavor of XML, benefits from both XML’s flexibility and the plethora of available tools. On the flexibility side, DSML directories can evolve and change and grow without significantly impacting dependent applications. Likewise, applications can maintain broad applicability because they accept a flexible input set.
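The following sketch shows what "accepting a flexible input set" might look like in practice. It reads a small document modeled on DSML version 1 (the dsml:entry, dsml:attr, and dsml:value element names and the namespace URI are written from memory of that version and should be checked against the specification). The sample entry, the badgeColor attribute, and the read_entries helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for DSML version 1; verify against the specification.
DSML_NS = {"dsml": "http://www.dsml.org/DSML"}

# Illustrative document. "badgeColor" stands for an attribute added to the
# directory after the application below was written.
SAMPLE = """
<dsml:dsml xmlns:dsml="http://www.dsml.org/DSML">
  <dsml:directory-entries>
    <dsml:entry dn="cn=Alice,ou=People,o=Example">
      <dsml:attr name="cn"><dsml:value>Alice</dsml:value></dsml:attr>
      <dsml:attr name="mail"><dsml:value>alice@example.com</dsml:value></dsml:attr>
      <dsml:attr name="badgeColor"><dsml:value>blue</dsml:value></dsml:attr>
    </dsml:entry>
  </dsml:directory-entries>
</dsml:dsml>
"""


def read_entries(xml_text, wanted=("cn", "mail")):
    """Collect only the attributes the application cares about."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iterfind(".//dsml:entry", DSML_NS):
        record = {"dn": entry.get("dn")}
        for attr in entry.iterfind("dsml:attr", DSML_NS):
            name = attr.get("name")
            if name in wanted:  # unknown attributes are skipped, not rejected
                record[name] = [v.text for v in attr.iterfind("dsml:value", DSML_NS)]
        entries.append(record)
    return entries


print(read_entries(SAMPLE))
```

Because the reader selects attributes by name and ignores the rest, the directory can grow new attributes without breaking this application.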
In terms of tools, there’s no question that developing with a ubiquitous standard such as XML has clear benefits. There are myriad tools to choose from, at all levels of deployment, for XML application developers. There are XML libraries available for every significant platform and language. Because the tools are designed for a much larger, more general set of developers, XML is naturally better supported than LDAP.
The second question still remains: is LDAP too entrenched? A quick look at the founding partners of the DSML initiative shows that they include the major directory product vendors. Sun, Novell, IBM, Oracle, and Microsoft are among that founding group, which now includes more than 25 members [http://www.dsml.org/participants.html].
With strong backers clearly in place, the final question concerns existing installations of LDAP directories. Will they continue to hobble future developments because of the cost of abandoning or reengineering them? It is doubtful that even the installed base of LDAP servers can stem the tide of DSML. Already there are wrappers to transform LDAP sources into DSML servers. One example can be found at [http://www.dsmltools.org/].
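To give a feel for what such a wrapper does, here is a minimal sketch that turns one LDAP-style entry (a distinguished name plus multi-valued attributes) into DSML-style XML. It is not the code of the tools referenced above. The element names and namespace are modeled on DSML version 1 and should be checked against the specification; the entry_to_dsml helper and the sample entry are hypothetical, and object classes, which DSML represents separately, are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for DSML version 1; verify against the specification.
DSML_URI = "http://www.dsml.org/DSML"
ET.register_namespace("dsml", DSML_URI)


def q(tag):
    """Return the namespace-qualified form of a DSML element name."""
    return f"{{{DSML_URI}}}{tag}"


def entry_to_dsml(dn, attributes):
    """Serialize one entry as DSML-style XML.

    `attributes` maps each attribute name to a list of string values,
    mirroring the multi-valued attributes of an LDAP entry.
    """
    root = ET.Element(q("dsml"))
    entries = ET.SubElement(root, q("directory-entries"))
    entry = ET.SubElement(entries, q("entry"), {"dn": dn})
    for name, values in attributes.items():
        attr = ET.SubElement(entry, q("attr"), {"name": name})
        for value in values:
            ET.SubElement(attr, q("value")).text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative entry, as it might come back from an LDAP search.
xml_text = entry_to_dsml(
    "cn=Alice,ou=People,o=Example",
    {"cn": ["Alice"], "mail": ["alice@example.com", "a@example.com"]},
)
print(xml_text)
```

A full wrapper would also issue the LDAP search itself and handle binary values and object classes; the sketch covers only the mapping step.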
XML is a proven technology that has been used successfully at all tiers of the business application hierarchy. Likewise, directory services have been an important component in many applications. Current directory components are frequently LDAP servers, which are not XML-enabled.
DSML bridges the gap between directory services and XML applications in a clean, robust way. DSML for directories, as an alternative to LDAP, enables a new level of application flexibility and allows mature XML tools and practices to add to the value of directory components.
The coupling of directories and XML imposes new requirements for storage and retrieval of data. This coupling clears the way for new types of applications that will demand larger data volumes with better performance and scalability.
Today, LDAP directories thrive with fixed, pre-defined schema. The use of XML will force directories to deal with the non-trivial issues of schema evolution and change. All of this suggests that there is a need for new technologies to support
In
the next article we will approach these issues by discussing
and