SimpleDB
Started playing around with Amazon's SimpleDB. It's still in beta and there are a few niggly restrictions, but it is usable and "does what it says on the tin". It's a simple cloud-based storage mechanism for storing single rows of arbitrary data. The emphasis is on simple: single-table stuff. So this is not going to be a replacement for your well-structured database.

However, one of the things you notice when your system starts getting a lot of traffic is that there is an awful lot of data simply flowing over your servers that you would really like to store, report on and build some cool new things with. If only you had somewhere to collate and store this information. Enter SimpleDB.

Sure, I could invest time and energy in bringing a data warehouse online (if I could get it into the production setup), which would certainly address some of SimpleDB's usage restrictions, but this sort of task is exploratory. I want the freedom to start instrumenting and collecting data basically on a whim without incurring significant setup costs. Some stats will definitely be useful, plugging holes in current decision-making; others are more of the "suck-it-n-see" variety. Using SimpleDB solves two basic problems for me over just using one of the (many) MySQL databases available to me:

- There's an overhead with storing information in a production database. It needs to be properly set up for disaster recovery, and that setup needs to be tested. It needs a pruning and archiving schedule negotiated with the support people. Then there's the testing. This is a lot of pro-forma setup that I really don't want to invest in upfront. Sure, if this thing takes off or outgrows SimpleDB's restrictions, then I can revisit this, but for now I don't want any roadblocks.
- Neatly sidestepping organisational boundaries. We have many environments where I would like to gather data, but if I have to negotiate with each environment for storage, I'm going nowhere fast. Moving the problem outside the network avoids that negotiation entirely. It also gives me a natural place to centralise the information, something that would also be hard to get approved internally because of network partitioning.
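
To make the "single rows of arbitrary data" idea a bit more concrete, here is a rough sketch of what flattening an instrumentation record into SimpleDB-style attributes might look like. SimpleDB stores everything as strings, and an attribute can hold more than one value, so the sketch turns a Python dict into a flat list of Name/Value pairs, mirroring the shape of a PutAttributes request. The `to_attributes` helper and every field name in the sample event are made up for illustration, and nothing here actually talks to Amazon.

```python
from typing import Any, Dict, List


def to_attributes(event: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten one event into SimpleDB-style Name/Value string pairs.

    Hypothetical helper: it only builds the pairs, it sends nothing.
    """
    attributes: List[Dict[str, str]] = []
    # Sort by attribute name so the output order is predictable.
    for name, value in sorted(event.items()):
        # A list or tuple becomes a multi-valued attribute: one pair per element.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            # Everything is stored as a string, so convert here.
            attributes.append({"Name": str(name), "Value": str(v)})
    return attributes


# Made-up instrumentation record; none of these fields come from a real system.
event = {
    "env": "staging",
    "endpoint": "/search",
    "latency_ms": 182,
    "tags": ["slow", "cache-miss"],
}
attributes = to_attributes(event)

for pair in attributes:
    print(pair["Name"], "=", pair["Value"])
```

Because there is no schema to agree on first, adding a new field to the event is all it takes to start collecting it, which is the "on a whim" part.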


3 Comments:
What "few niggly restrictions"?
Firstly, size (max 100 domains, max records per domain). Secondly, usage restrictions: no query can take more than 5 seconds. I don't think this would be a problem for me, but if it eventually does become one, what are my options?
Thirdly, SimpleDB only uses key access. I would prefer certs.
Fergal:
1.) The max number of domains is set to 100 because creating a domain is a heavyweight operation. The intention is to stop folks from designing applications with millions of domains. The 100 domain limit can be lifted selectively by contacting Amazon Developer relations.
2.) The ability to run long queries, >5 seconds, was introduced in March with the asynchronous query functionality.
3.) The max records per domain was recently raised from 250 million to 1 billion.
Doesn't the X.509 Certificates support for SOAP meet your requirements?
Cheers,
Charlie