
Edge DBs Back the Biggest Acts in Business
By Joe McKendrick 12/12/2005
Google is today's rock star of the information age, fielding more than 200 million queries against approximately 10.4 billion items each day through its various services. Google is shaking the IT industry to its foundations with its online software-as-a-service and advertising models. It even has Microsoft considering a business makeover.
What kind of databases power Google? The online resource provider makes heavy use of MySQL for its backend functions. What many people don't know is that, underneath the covers, many of Google's ancillary services are supported by what is typically considered an "edge" database. Google employs Berkeley DB from Sleepycat Software to manage user ID settings for Google services such as Gmail, Google Alerts, Google Groups, Google Personalized Homepage, and Froogle Shopping List, as well as Google Answers and Personalized Search, all of which use a unified sign-on service called Google Accounts.
Edge databases also play a key supporting role at that other Internet superstar, Amazon. Databases from Progress Software's Real Time Division operate as the data caching front end to a phalanx of Oracle systems in the Amazon Marketplace. As part of Amazon's Fast Propagation Architecture, the Progress ObjectStore Enterprise object database delivers accurate, up-to-the-second views of inventory data received for the more than 50,000 individual sellers who use the marketplace to transact business, supporting as many as five million daily page views, with peak loads of 1,000-plus page views per second.
Edge databases, and their closely related counterparts, embedded databases, are a stealth phenomenon--users are usually unaware of their role in an application. But for many in this emerging industry, that's just the way they want it to be. "We're in environments where we're always running in a lights-out always-on mode, with no administration whatsoever," Rex Wang, vice president of marketing for Sleepycat Software, told DBTA. "There are more requirements for applications to operate in a completely unattended fashion. This is done by exposing the administration APIs to the programmer, so the person writing the application can do the things that a human DBA would have ordinarily done during runtime."
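The idea Wang describes, an application carrying out its own database administration through an API, can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in embedded engine; it is not Berkeley DB's API, and the function names, table, and file names are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_store(path):
    """Open (or create) the embedded store; no separate server process is involved."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (user_id TEXT PRIMARY KEY, value TEXT)"
    )
    conn.commit()
    return conn


def run_maintenance(conn, backup_path):
    """Do in code what a DBA might otherwise do by hand: verify, then back up."""
    # integrity_check returns a single row containing 'ok' when no problems are found.
    status = conn.execute("PRAGMA integrity_check").fetchone()[0]
    if status != "ok":
        raise RuntimeError(f"integrity check failed: {status}")
    # Online backup: copies the live database to another file while it stays open.
    dest = sqlite3.connect(backup_path)
    try:
        conn.backup(dest)
    finally:
        dest.close()
    return status


# Hypothetical usage: the application schedules its own maintenance.
workdir = tempfile.mkdtemp()
store = open_store(os.path.join(workdir, "settings.db"))
store.execute("INSERT OR REPLACE INTO settings VALUES (?, ?)", ("alice", "lang=en"))
store.commit()
result = run_maintenance(store, os.path.join(workdir, "settings.bak"))
```

In an unattended deployment, the application itself would call something like run_maintenance on a timer or after a set number of writes, rather than relying on an operator.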
Growing Stronger
While the size of the highly diverse edge and embedded database market is difficult to track, new results out of Evans Data--which has been tracking the market for a number of years--confirm that usage of embedded databases is on the rise. Nearly a quarter of data managers and developers, 24 percent, report that they are currently deploying applications for embedded databases--up from 18 percent just a year ago. Another 20 percent are looking at implementing edge/embedded databases within the next year.
While applications are diverse, hot markets for embedded or edge databases include telecommunications, ranging from mobile devices to the servers that power services over the network, as well as data storage devices, Wang said. Another emerging area is service-oriented architectures. "SOA consists of components that operate in a loosely coupled fashion to form a composite application," Wang explained. "Each of these components inevitably have a need for local storage, which is basically the definition of an edge database."
RFID is another booming data management area where edge and embedded databases will play a key role. RFID "is generating enormous streams of data," said Ken Rugg, vice president of data management products at Progress Software's Real Time Division. "There's no way it would be feasible to capture that into the traditional enterprise core databases. You need a way to capture that data at a very high rate of speed. You want to put those data management solutions on devices, very close to the receivers, to be able to capture information and spot trends ahead of the game. One of the exciting things about RFID is analytics can be moved ahead of the transactional systems."
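The pattern Rugg outlines, capturing reads next to the receiver and analyzing them before they reach core systems, might look something like the following sketch. It again assumes Python's built-in sqlite3 module as a stand-in for an embedded engine; the table layout, function names, and sample reads are all made up for illustration.

```python
import sqlite3

# In-memory store sitting next to the (hypothetical) RFID reader.
edge = sqlite3.connect(":memory:")
edge.execute("CREATE TABLE reads (tag_id TEXT, reader TEXT, ts REAL)")


def capture(batch):
    """Write a batch of raw tag reads locally in a single transaction."""
    with edge:  # commits on success, rolls back on error
        edge.executemany("INSERT INTO reads VALUES (?, ?, ?)", batch)


def summarize(since_ts):
    """Reduce raw reads to one row per tag: the compact view worth forwarding."""
    rows = edge.execute(
        "SELECT tag_id, COUNT(*), MAX(ts) FROM reads "
        "WHERE ts >= ? GROUP BY tag_id ORDER BY tag_id",
        (since_ts,),
    )
    return rows.fetchall()


# Made-up sample reads: (tag, reader, timestamp in seconds).
capture([
    ("tag-A", "dock-1", 0.0),
    ("tag-A", "dock-1", 0.5),
    ("tag-B", "dock-1", 0.7),
    ("tag-A", "dock-1", 1.2),
])
summary = summarize(0.0)
```

Here four raw reads collapse into two summary rows, one per tag, so only the aggregate would need to travel on to the enterprise database.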
Simplicity
Many organizations require that new database projects demonstrate tangible business value at low cost and with easy deployment. This is driving the movement to open source databases, and it is also making embedded databases an attractive value proposition for many companies. When are embedded or edge databases a better choice than larger relational databases? The answer from most experts is a firm, unequivocal "it depends." As a rule of thumb, an embedded or edge database is likely to be more suitable the further away from central IT the data is to be collected and managed. "In terms of database size, the analogy I like to give is a surgeon selecting a scalpel for a procedure," observed Bruce Weed, program director for IBM's Informix unit. "There's a different tool depending on what needs to get done."
However, implementing an edge or embedded database is far less painful than surgery--or deploying a larger relational database. "What customers really want is a single, simple, cost-effective integration platform for the immediate and continuous capture, transformation, and delivery of data from one database to another, regardless of the database type, physical location, or hardware platform," Robert Anderson, analyst with Gartner, told attendees at Pervasive's recent user conference.
As software vendors add new features, software grows too complex and costly for customers to maintain and manage, Anderson said. Integration of data has become critical because of the massive heterogeneity within organizations and the continued use of legacy applications, including diverse formats and structures. According to Gartner, 50 percent of the implementation cost of packaged application software is spent on integration alone.
"Embedded databases are designed to be significantly easier to manage than enterprise or full-featured DBMS packages," agreed Matt Zito, chief scientist for GridApp Systems. "The idea is that there is no need to have a full infrastructure to support embedded databases, and consequently there is less management complexity and overhead."
The Evans Data survey confirmed that this low complexity is the main draw with embedded databases. Asked to identify the single greatest reason for working with embedded databases, developers named ease of deployment and use.
Automatically Autonomic
Data center automation, or autonomic computing, is almost a natural role for embedded or edge databases. In fact, for many enterprises, embedded and edge databases may represent the first steps toward IT automation. "If you think about when a technology evolves, the same rules apply," Dave Nocera, president of Verifichi, told DBTA. "Every time a technology evolves, the human interface becomes simpler, while under the hood, things become more complicated." The same trend is occurring with databases, Nocera continued. "Today, a lot of specialists are required for the back-end databases. But when you embed the database, you're taking the complexity away from the operator, and hiding it within the engine."
External factors are driving embedded and edge database adoption as well. For example, compliance mandates are compelling businesses to find self-contained environments in which audit data can safely be stored and centralized. In this area, embeddable configuration management databases (CMDBs) can help track IT assets and audit information. "You're better served with a data warehouse just for data assets," said Nocera. "Because of compliance requirements, a lot of competitive products are coming out with CMDBs built into them."