A few years ago, I read a blog post by Uncle Bob that advocated delaying the choice of database as long as possible. The main motivation is that the database should not be the central hub of your application. Instead, your application should do things. That logic should be the focus of your application, and the database should exist to support it. It is something that has resonated with me, but that I’ve never implemented. Whether using SQL Server or MongoDB, the decision always seems to come up early.
This past week I attended a talk at KCDC on graph databases. I had looked (very briefly) at graph databases in the past but didn’t give them much thought. On the drive home today, I realized that an app I had worked on in the past would have benefited from a graph database.
Without going into too many details (for the sake of brevity, not because I’m afraid you’ll steal my idea), the application was going to have horses that were being fostered and adopted, and people who would be rescuing, fostering, and adopting the horses. The horses would have lots of associated records (like vet visits, shots, etc.) as well as a current status (rescued, fostered, in training, adopted, deceased, etc.).
At one point, I tried doing it in SQL with tables that mapped one person to multiple horses, one horse to multiple vaccination records, and so on. That got messy really quickly. Then I tried MongoDB and had a single Horse document with all of its relevant information and a separate Person document with all of their relevant information, including an array of the horses they owned.
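To make the document approach concrete, here is a minimal sketch of what those two document types might have looked like. It uses plain Python dicts rather than a real MongoDB driver, and every id, name, and field (`vet_visits`, `vaccinations`, `horses`, and so on) is an assumption made up for illustration, not the actual schema from the app.

```python
# Hypothetical document shapes; all ids and field names are illustrative assumptions.
horse_docs = {
    "horse-1": {
        "name": "Biscuit",
        "status": "fostered",
        "vet_visits": [{"date": "2015-03-02", "reason": "checkup"}],
        "vaccinations": [{"date": "2015-03-02", "vaccine": "tetanus"}],
    },
    "horse-2": {
        "name": "Juniper",
        "status": "adopted",
        "vet_visits": [],
        "vaccinations": [],
    },
}

person_docs = {
    "person-1": {"name": "Alex", "horses": ["horse-1", "horse-2"]},
}


def horses_for(person_id):
    """Follow the person's array of horse ids to the horse documents."""
    return [horse_docs[horse_id] for horse_id in person_docs[person_id]["horses"]]


def people_for(horse_id):
    """The reverse lookup has no direct pointer, so it scans every person."""
    return [p["name"] for p in person_docs.values() if horse_id in p["horses"]]
```

Going from a person to their horses is a direct lookup, but going from a horse back to its people means scanning every Person document. The relationship lives on only one side.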
Looking at it from a purely data standpoint, it seems like graph databases might fit the niche here. A horse (vertex) is owned (edge) by a person (vertex). Vertices and edges are the central concepts of graph databases.
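As a rough illustration of the vertex-and-edge idea, here is a small in-memory sketch in plain Python. It is not OrientDB's API, just a toy model, and the names (`Edge`, `outgoing`, `incoming`, the `owned_by` and `rescued_by` labels, and the sample people and horse) are all assumptions invented for the example.

```python
from collections import namedtuple

# An edge is a labeled, directed connection between two vertex ids.
Edge = namedtuple("Edge", ["source", "label", "target"])

# Hypothetical vertices; the ids and properties are made up for illustration.
vertices = {
    "alex": {"kind": "Person", "name": "Alex"},
    "sam": {"kind": "Person", "name": "Sam"},
    "biscuit": {"kind": "Horse", "name": "Biscuit"},
}

# The relationships are data in their own right, not foreign keys or arrays.
edges = [
    Edge("biscuit", "owned_by", "alex"),
    Edge("biscuit", "rescued_by", "sam"),
]


def outgoing(vertex_id, label=None):
    """Ids of vertices reached by edges leaving vertex_id, optionally filtered by label."""
    return [
        e.target
        for e in edges
        if e.source == vertex_id and (label is None or e.label == label)
    ]


def incoming(vertex_id, label=None):
    """(source id, label) pairs for edges arriving at vertex_id."""
    return [
        (e.source, e.label)
        for e in edges
        if e.target == vertex_id and (label is None or e.label == label)
    ]
```

With this shape, `outgoing("biscuit", "owned_by")` answers "who owns this horse?" and `incoming("alex")` answers "which horses point at this person, and how?". A new kind of relationship, such as fostering, would just be another edge label rather than a new table or a new array field.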
So today I created a really simple OrientDB database, threw a couple of records in there, and played around. It was actually fairly simple and seemed to confirm my initial feeling that this might be a good match.
As I thought about this more and more, I realized something: my mind tends to think primarily in an RDBMS way. What I mean by that is, I think “This entity connects to this entity” and I almost visually picture a DB diagram of SQL joins, even if that’s not the best model. I think that is true of a lot of developers, certainly most developers who work with a database in a corporate environment.
I’m slowly starting to think of database records as documents, but that’s not the right answer either.
Instead, I think the right answer is to look at how the data in your application interacts with other data in your system and then ask the question “What is the best way to represent this?” The danger is that the question often gets asked with a silent assumption attached: “What is the best relational way to represent this?”
It’s a lot more work, but what if we looked at the data and then spent a little bit of time searching to see if there was any existing technology that matched what we were trying to do? It’s hard, because sometimes we don’t even know what question to ask. However, the more I’m exposed to different options (RDBMS, document store, graph database), the more it helps break me out of the mindset of a database table that knows about another database table.