But there are two pieces that bloggers are not paying attention to, or that they didn’t notice (after all, most of them are not developers).
I found the service so interesting that I checked out the API and how it works, and bam! It hit me in the head.
Did anyone say X.500?
My first realization is that it’s not a Database, it’s a Directory Service!
OK, most people (even developers) would not know what a Directory Service is even if you hit them on the head with an Active Directory book. Anyway, if I remember correctly from my years on Exchange Server (98-99), when I was working on the Active Directory integration, a Directory Service has a few peculiarities that differentiate it from a traditional database.
First off, each object (this is what a “record” is called in a Directory Service) can contain different attributes, and the schema can be changed on the fly (it’s a bit more complicated than that).
The next interesting aspect is that a single attribute (field) can have multiple values, just like in Amazon SimpleDB! This means that if I define an attribute “UsedBy”, I can set its values to “Realtors” and “Brokers”. In a traditional relational database you’d need 3 tables to do something like this.
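To make those two traits concrete, here is a minimal in-memory sketch in Python. It does not call the SimpleDB API at all; the `Item` class, its `put` and `get` methods, and the listing names are all made up for illustration. It only shows what “no fixed schema” and “multiple values per attribute” look like as a data structure.

```python
from collections import defaultdict


class Item:
    """Hypothetical schemaless item: each attribute name maps to a set of values."""

    def __init__(self, name):
        self.name = name
        # No schema is declared anywhere; attributes appear when first written.
        self.attributes = defaultdict(set)

    def put(self, attribute, *values):
        """Add one or more values to an attribute, keeping the existing ones."""
        self.attributes[attribute].update(values)

    def get(self, attribute):
        """Return the attribute's values sorted, or an empty list if it is unset."""
        return sorted(self.attributes.get(attribute, ()))


# One attribute, two values: no join tables needed.
listing = Item("listing-001")
listing.put("UsedBy", "Realtors", "Brokers")
listing.put("City", "Seattle")

# A second item is free to carry a completely different set of attributes.
other = Item("listing-002")
other.put("Bedrooms", "3")

print(listing.get("UsedBy"))
print(other.get("UsedBy"))
```

In the relational version, “UsedBy” would typically turn into a table of listings, a table of user types, and a link table between the two, which is where the 3 tables come from.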
Finally, a Directory Service allows a hierarchy of objects, meaning that instead of tables you have nodes (which are container objects) and objects hang off those nodes. Uh oh, SimpleDB doesn’t have that, so my whole theory goes down the drain… Not really: it provides a thing called a “Domain” which, if you want to (but you don’t), can be used as a hierarchy.
And the best application for SimpleDB will be…
Calling SimpleDB a database or a directory service doesn’t change what it can do or what people can do with it; it’s just a convention. What matters are the nice products that will come out of it, and IMHO, one of the most interesting ones will be…
A search engine!
What? Somebody will build a search engine on top of SimpleDB to compete with Google? Nah! Somebody (lots of somebodies, actually) will be able to build their own site search service on top of SimpleDB.
Imagine that Redfin is not a gazillion-dollar VC-backed startup. They are just getting started and want to index all the listings from the MLS to do a kind of search that you cannot run directly against the MLS database. They can put all that data into SimpleDB (the flexible schema is a huge plus) and not have to worry about keeping terabytes of data in their own database. Do you know how much it costs in time and money to maintain a terabyte database? A lot. There are backups, there are perf issues, there is hardware redundancy, etc.
The only thing missing from the SimpleDB API to provide some serious “site search” capability is a way to rank attributes when doing a query.
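Until something like that exists, one workaround would be to rank on the client side after the query comes back. The Python sketch below is just that, a sketch: the `rank` function, the attribute weights, and the sample listings are all assumptions of mine, and nothing in it is part of the SimpleDB API. It scores each item by adding up the weight of every attribute whose values contain the search term.

```python
def rank(items, term, weights):
    """Score items by which attributes contain the term (hypothetical helper).

    items:   dict of item name -> dict of attribute name -> set of string values
    weights: dict of attribute name -> weight; unlisted attributes count as 1.0
    Returns (item name, score) pairs, best score first, ties broken by name.
    """
    term = term.lower()
    scored = []
    for name, attributes in items.items():
        score = 0.0
        for attribute, values in attributes.items():
            # An attribute counts once, no matter how many of its values match.
            if any(term in value.lower() for value in values):
                score += weights.get(attribute, 1.0)
        if score > 0:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up listings, shaped like multi-valued attribute sets.
listings = {
    "listing-001": {"Title": {"Lakefront cottage"},
                    "Description": {"Quiet street, big yard"}},
    "listing-002": {"Title": {"Downtown condo"},
                    "Description": {"Five minutes from the lakefront"}},
    "listing-003": {"Title": {"Lakefront lot"},
                    "Description": {"Lakefront access, no structures"}},
}

# Assumed weights: a hit in the title matters three times as much.
weights = {"Title": 3.0, "Description": 1.0}

results = rank(listings, "lakefront", weights)
for name, score in results:
    print(name, score)
```

The obvious downside is that every matching item has to be pulled down before it can be sorted, which is exactly why ranking inside the service would be so much nicer.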