Aaronontheweb

Hacking .NET and Startups

Code Camp Talk: RavenDB vs MongoDB

June 27, 2012 06:19 by Aaronontheweb in MongoDB, RavenDB // Tags: , , // Comments (1)

This past weekend at SoCal Code Camp I presented a session along with my friend Nuri Halperin entitled “Battle of the NoSQL Databases: RavenDB vs. MongoDB.”

I represented the RavenDB team, having used it in production now for a couple of months (and ditched Mongo to do it.) I’ll blog more about the specifics of RavenDB and what it’s awesome at some point in the future, but nevertheless I wanted to post my slides here so you could see the bullet-by-bullet comparison between the databases.

We didn’t cover everything, but we did try to capture all of the high-level details:

NoSQL Shootout: RavenDB vs MongoDB
Update: Some errata that has been pointed out to me courtesy of Itamar Syn-Hershko of the Hibernating Rhinos team.
 Raven actually uses BSON internally as well, and has no auto-sharding support by design, see 

If you enjoyed this post, make sure you subscribe to my RSS feed!



MongoDB vs. SQL Server 2008: A .NET Developer’s Perspective

June 30, 2011 16:01 by Aaronontheweb in ASP.NET, MongoDB, SQL Server // Tags: // Comments (22)

One of the first projects I put together this year was Captain Obvious, a nifty little application that runs off of AppHarbor and ASP.NET MVC3. What made Captain Obvious special for me was that it was my first time using something other than a relational database1 in production – I chose MongoDB because it stands out to me as a lightweight, easy-to-work with store that’s easier to use for most CRUD applications. Since then I’ve gone on to build other projects which depend on Mongo.

What I’ve learned since is that MongoDB and SQL Server are tools that aren’t 100% interchangeable and are more situational than dogmatists make them out to be.

My goal in writing this is to help inform you on how you should decide to judge these two technologies as options for your ASP.NET / WebMatrix / WCF applications.

Relational Data Models vs. Document Models

The key to using Mongo or SQL Server effectively is understanding how the underlying data model works and how this impacts your ability to read / write what you want when you want to the datastore. Below are illustrations of the relational (SQL) model versus the document model.

mongo vs sql differences

 

In a relational model all of your information is expressed in the form of tables, all of which contain keys and some of which contain foreign keys to other tables. All of the information you read from and write to the database is expressed as either adding rows to these tables or combining their values based on some keys.

In a document model you have a relatively flat collection of items all identified by one primary key2, and instead of defining a relationship between two collections you simply embed one inside the other. So the relationship between the three tables in our relational model is expressed as a single, flat document in this model.

It is important to note here that there’s no schema that tells MongoDB what fields should or shouldn’t be expected in each document in the “Things” collection – in the relational universe each table is strongly, declaratively typed and every single item inserted into the row must conform to all of the constraints imposed by the relational database management system (RDBMS.) Anything goes in Mongo (although this can cause major problems, as we will see later.)

What Do Relational and Document Models Have in Common?

So what do these two data models have in common?

  1. Both support the notion of primary keys and indexes, and MongoDB can support multiple indices if needed;
  2. Both support queries and have models for sorting / limiting results;
  3. Both support the ability to reference other documents / tables (“wha? in Mongo? Yup.”)
  4. Both have a strong typing system; and
  5. Both support aggregation operations like SUM(), COUNT(), etc…

 

Seems pretty straightforward, right? But what’s up with documents being able to refer to each other?

As it turns out, implementing a database where every possible piece of information of interest to users is all embedded inside of its own distinct document comes with some drawbacks, so the creators of MongoDb added support for DbReference and the ability to do cross-collection references as well. A necessary evil in the name of practicality.

What’s Different between Relational and Document Models?

So what’s really different between the two models?

  1. Document models don’t have SQL – their query tools are extremely primitive in contrast (but the models are also much simpler and don't require sophisticated queries;)
  2. Fetching document references in the document model has to be done inside of separate queries3 whereas they can be done all in the same transaction in the relational model;
  3. Document models don’t have a schema – each document in a collection can have extra fields or fields with different type values, whereas all rows must conform to the same set of constraints in order to be inserted;
  4. In the document model types are associated with data upon assignment, rather than declared in advance;
  5. Document models have much more primitive tools for performing aggregations (edit: actually, not necessarily true for Mongo - it has built-in MapReduce which is powerful and sophisticated;) and
  6. Queries are defined on a per-collection basis in the document model, whereas a query in the relational model is an abstraction that simply refers to any number of related tables.

 

The picture that should be starting to form in your head of MongoDb vs. SQL Server at this point is a flat, simple system on one side and a multi-dimensional, rich system on the other. This is how I look at the two technologies in contrast to each other.

What’s Different about Developing .NET Apps Against Mongo and SQL Server 2008?

So at the end of the day, what's the bottom line for .NET developers who want to use Mongo or SQL Server in a web application? What are the REAL trade-offs?

Mongo IS the OR/M

The core strength of Mongo is in its document-view of data, and naturally this can be extended to a "POCO" view of data. Mongo clients like the NoRM Project in .NET will seem astonishingly similar to experienced Fluent NHibernate users, and this is no accident - your POCO data models are simply serialized to BSON and saved in Mongo 1:1. No mappings required.

I'm going to show you two similar pieces of source code I am using in production - one using NoRM and MongoDb on CaptainObvious and the other using Dapper and SQL Azure on XAPFest:

NoRM:

public IdeaResultSet GetIdeasByAuthor(int authorID, int offset = 0, int count = 10)
        {
            using (var db = Mongo.Create(ConnectionString))
            {
                var ideas =
                    db.GetCollection().AsQueryable().Where(x => x.AuthorReference.Id == authorID)
                    .Skip(offset)
                    .OrderByDescending(
                        x => x.DatePosted).Take(count).ToList();

                var totalIdeas = db.GetCollection().AsQueryable().Where(x => x.AuthorReference.Id == authorID).Count();

                //Fetch the referenced authors before we serve up the list again
                foreach (var idea in ideas)
                {
                    idea.Author = idea.AuthorReference.Fetch(() => db);
                }

                var resultSet = new IdeaResultSet { Ideas = ideas, 
                    PaginationValues = new PaginationTuple { 
                        MaxPages = PageCounterHelper.GetPageCount(totalIdeas, count), 
                        CurrentPage = PageCounterHelper.GetPageCount(offset, count) } 
                };

                return resultSet;
            }
        }

Dapper:

public IList<XfstApp> GetAppsByUser(string userName)
        {
            using (var conn = new SqlConnection(ConnectionString))
            {
                try
                {
                    conn.Open();

                    var appResults = conn
                        .Query(@"SELECT * FROM Apps
                                            INNER JOIN AppOwners ON AppOwners.AppName = Apps.AppName
                                            WHERE LOWER(AppOwners.UserName) = LOWER(@UserName)
                                            ORDER BY Apps.DateCreated ASC", new { UserName = userName });

                    //Return whatever we were able to collect
                    return appResults.Select(x => x.ToApp()).ToList();
                }
                catch (SqlException ex)
                {
                    TraceError(ex);
                    return new List<XfstApp>();
                }
                finally
                {
                    //Close the connection when we're finished, regardless of what happened
                    conn.Close();
                }
            }
        }

 

The amount of source code for these two technologies or the nature of it isn't wholly different.... What is drastically different is how I was thinking about the data when I was writing this code - I can save an instance of an Idea object to MongoDb on CaptainObvious without ever having created the collection first or defined a schema.

Whenever I want to look up an idea, I just pick one based off of a key value that I specify and I don't worry about any joins or anything (although I do have to load objects from the author collection if I need to display the author's name and contact info.)

In the SQL universe, I have to define my tables in advance and each time I want to extract an object, I have to think of it in terms of combined relationships between my tables and this requires a more thoughtful approach.

Mongo, in other words, lends itself to rapid application development whereas SQL Server has some innate friction built into any schema-based system.

In Mongo, the DBMS Isn't There to Protect Your Data's Integrity

One of the major advantages of a schema-based DBMS is that if the data a calling application tries to insert something that doesn't fit the schema into a row, the operation always fails. In Mongo, this isn't true - you can have one record in a collection with extra fields or fields of an odd type, and it can totally screw up the BSON serializer when it tries to process the collection (depending upon how flexible the serializer is.)

SQL users take this for granted, but when you have issues in Mongo along these lines they can be really frustrating to solve and difficult to debug.

In Mongo, Operations May Not Be Atomic

Operations are not atomic in Mongo by default, so all sorts of fun things can happen when you have multiple users changing properties on the same document. You can set an atomic flag to true, but even then operations still aren't really atomic (they're written from memory to disc in bulk.)

If you use Mongo and carry the same ACID assumptions that we learned on SQL Server, you might be in for a nasty surprise :p

Conclusion

Overall, the biggest difference between these two technologies is the model and how developers have to think about their data. Mongo is better suited to rapid application development, but in my opinion falls apart in scenarios where ACID-compliant systems are a must, like anything that goes anywhere near a financial transaction.

But, that's just my opinion :p


1typically I’ve only used SQL Server / MySQL in the past

2You can add indices to other fields in Mongo too, but that’s outside the scope of this article

3unless I am doing it wrong, which I may well be doing

If you enjoyed this post, make sure you subscribe to my RSS feed!



How I Built CaptainObvio.us

Captain Obvio.us - a place to share ideasI made a tiny splash on Hacker News a month ago when I asked for feedback on my newest side project, CaptainObvio.us – a simple portal for sharing ideas and soliciting feedback from a community of peers. The idea was popular and I’ve received a ton of feedback – I’ve implemented most of the Hacker News community’s suggestions but haven’t had the chance to do another round of customer development.

What I wanted to share in this blog post was some of the secret sauce I used for creating CaptainObvio.us – I originally created it mostly to learn MongoDB, and learned way more than that along the way.

Webstack: ASP.NET MVC3 on AppHarbor

I used ASP.NET MVC3 as my webstack of choice with AppHarbor as my hosting platform. ASP.NET MVC3 is a massive improvement over MVC2, and I took advantage of Razor syntax, the built-in support for DI (dependency injection) on controllers, and wrote a number of customized helpers to do things like create an action filter for Twitter @Anywhere.

AppHarbor has been a great experience to work with - I use Git for souce control for most of my personal projects like this one so deployments are a breeze on AppHarbor, but the other major reason I picked AppHarbor is that it shares the same Amazon AWS datacenter as MongoHQ - another [Thing]-as-a-Service that I used for hosting my MongoDB instance.

Data: MongoDB on MongoHQ

The original purpose of CaptainObvio.us was for me to learn MongoDB, a schemaless (NoSQL) document database written in C++ that is becoming all the rage in the Ruby on Rails universe. CaptainObvio.us is a good fit for Mongo given that the vast majority of its content consists of simple documents with a small amount of relational data for tying authors to ideas / comments and so forth.

I could not have gotten very far with MongoDB in C# were it not for the NoRM Project MongoDB drivers for C# - NoRM's drivers are much better than the default MongoDB drivers for C# and work similarly to LINQ-to-SQL (although not exactly.) It was a matter of hours for me to go from installing Mongo to having a functional site running with NoRM and ASP.NET MVC3.

Authentication: Originally Twitter @Anywhere; Sign-in-with-Twitter and Hammock Later

Twitter @Anywhere is a fairly new JavaScript-only framework for integrating Twitter into existing websites quickly and effortlessly - it automates all of the OAuth workflow, comes with tons of useful built-in widgets, and eliminates the need to write much (if any) plumbing needed to support Twitter integration.

This is all nice in theory but if you're building a site like CaptainObvious where your users use Twitter to sign-in and leave behind persistent data in your own data store, then this framework can really cause problems. I had to gut Twitter @Anywhere eventually because the post-authentication event hook would misfire on occasion due to issues on the client and thus I would have an "authorized" user running around adding comments and voting despite no record of them existing in the database.

In order to resolve the issue, I dumped Twitter @Anywhere and went with traditional Sign-in-with-Twitter powered by Hammock, which works fine.

Final Thoughts

I'm going to continue working on CaptainObvio.us, although I put it off largely due to all of the work I had to do engineering XAPFest's online presence on Windows Azure. If the project taught me anything, it's the value of continuous integration environments like AppHarbor and the ease with which you can get something up and running quickly with MongoDB.

I'd highly recommend checking out AppHarbor, MongoHQ, and the NoRM drivers if you're a .NET developer who's new to Mongo and wants to learn how to use it. I guarantee you that you'll appreciate traditional .NET databases like SQL Server 2008 and SQL Azure a lot better after you've worked with Mongo, as it'll help you learn the strengths and weakness of each respesctive platform.

If you enjoyed this post, make sure you subscribe to my RSS feed!



Search

About

My name is Aaron, I'm an entrepreneur and a .NET developer who develops web, cloud, and mobile applications.

I left Microsoft recently to start my own company, MarkedUp - we provide analytics for desktop developers, focusing initially on Windows 8 developers.

You can find me on Twitter or on Github!

Recent Comments

Comment RSS

Sign in