Codecademy approach: separate the site into 3 layers so you can optimize and take care of each area independently.

Client Layer – HTTP sizes, timing, asset sizes, JS, CSS
App Layer – Controllers, responses, grouped operations such as “give me a user”. Sessions are stored in Redis (http://redis.io/).
Data Layer – Encapsulates access patterns; it serves a “user” that can contain a lot of collections.

In small docs, the field names can be 40% of the total size, because every document stores its own field names.
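As a rough illustration of that overhead, the sketch below measures how much of a small document is taken up by its field names. It uses compact JSON as a stand-in for BSON (an assumption: the real on-disk sizes differ), and the field names are made up for the example.

```python
import json


def field_name_share(doc: dict) -> float:
    """Fraction of the compact JSON encoding used by top-level field names."""
    total = len(json.dumps(doc, separators=(",", ":")))
    names = sum(len(key) for key in doc)
    return names / total


# Hypothetical documents holding the same values under long and short names.
long_names = {"user_identifier": 1234, "exercise_identifier": 42, "completed": True}
short_names = {"u": 1234, "e": 42, "c": True}

print(f"long names:  {field_name_share(long_names):.0%} of the document")
print(f"short names: {field_name_share(short_names):.0%} of the document")
```

Shorter names shrink small documents noticeably, at the cost of readability.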

They used a cool way to guarantee uniqueness of the (user, exercise) pair: a compound unique index, {user_id, exercise_id, UNIQUE}, so the application does not have to care about it.
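A minimal sketch of such an index, assuming the Python driver (pymongo) and assuming the fields are literally named user_id and exercise_id. The collection is passed in, so any object with a compatible create_index method works.

```python
def ensure_unique_user_exercise(collection):
    """Create a compound unique index on (user_id, exercise_id).

    `collection` is assumed to be a pymongo Collection; 1 means ascending.
    """
    return collection.create_index(
        [("user_id", 1), ("exercise_id", 1)],
        unique=True,
    )
```

With the index in place, inserting a second document with the same pair fails on the server (pymongo raises DuplicateKeyError), so the app does not need its own check.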

In their 3rd version:

CLIENT | API APP | DB LAYER

The DB layer with MongoDB does not store the documents (exercise responses) themselves, only the Amazon S3 key (metadata). The server returns quickly to the client, and the client loads the content from Amazon S3 (using Amazon's bandwidth), so you can run a more lightweight server and scale by dividing the bandwidth.
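The idea can be sketched as follows. Everything here is an assumption for illustration: the field names, the bucket name, and the public URL style are hypothetical, and the talk did not show their actual code.

```python
def build_submission_response(metadata_doc: dict, bucket: str) -> dict:
    """Return a small payload that points at S3 instead of carrying the content."""
    return {
        "exercise_id": metadata_doc["exercise_id"],
        # The client downloads the body from S3 itself using this URL.
        "content_url": f"https://{bucket}.s3.amazonaws.com/{metadata_doc['s3_key']}",
    }


# Hypothetical metadata document as it might be stored in MongoDB.
doc = {"user_id": 7, "exercise_id": 42, "s3_key": "responses/7/42.json"}
print(build_submission_response(doc, "example-bucket"))
```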

In sharded configs on Amazon, take EC2 instances i1, i2, i3, where i1 is the primary and the others are secondaries. You turn off i2 and switch it to the new instance type. When it is back in sync you do the same with i1, and later on with i3.

So you are always alive!

 

NOTE: They store the session in Redis as volatile shared memory, so the server is stateless. The client cookie holds the unique session_id that links the user to the session.
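A minimal sketch of that session handling, assuming a redis-py style client (only its setex and get calls are used). The key prefix and the one-hour lifetime are assumptions, not values from the talk.

```python
import json
import secrets

SESSION_TTL_SECONDS = 3600  # assumed lifetime; the talk did not state one


def create_session(redis_client, user_id: str) -> str:
    """Store session data in Redis and return the id to put in the cookie."""
    session_id = secrets.token_urlsafe(32)
    payload = json.dumps({"user_id": user_id})
    redis_client.setex(f"session:{session_id}", SESSION_TTL_SECONDS, payload)
    return session_id


def load_session(redis_client, session_id: str):
    """Return the session dict, or None if it expired or never existed."""
    raw = redis_client.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None
```

Because any server can call load_session with the cookie value, no server keeps session state of its own.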

 

An application-driven schema answers:

  1. What pieces of data are we using together?
  2. What pieces of data are used read-only?
  3. What pieces of data are written continuously?
  4. And all the app related questions!

 

We have to remember MongoDB special characteristics:

  • Rich documents (arrays of items, nested documents)
  • Pre-join / embed data
  • There is no “join”
  • No constraints
  • Atomic operations (transactions are not supported)
  • No declared schema, but the docs in a collection usually have a similar schema

Always keep in mind: match the data access patterns of your application.

 

Rule of thumb: if you find yourself designing the schema the same way as in a relational SQL database, you are probably not using the best approach.

What does Living Without Constraints refer to? Keeping your data consistent even though MongoDB lacks foreign key constraints (at this moment)

 

 

Living Without Transactions

MongoDB and ACID (Atomicity, Consistency, Isolation, Durability): a life without transactions (I always hated the “redo”)!

It has ATOMIC operations, so nobody will see a document half-edited. With rich documents and embedded docs you can get the same result a transaction would give you. Why? Because everything you need to change lives in one document and that document is modified atomically, so you DO NOT need a transaction at all.
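For example, adding a comment and bumping a counter can be done in a single update of one document. This is a sketch assuming pymongo's update_one and a hypothetical posts collection with comments and comment_count fields.

```python
def add_comment_atomically(posts, post_id, comment: dict):
    """Append a comment and increment its counter in one single-document update.

    `posts` is assumed to be a pymongo Collection. Both changes hit the same
    document, so readers see either none of them or both.
    """
    return posts.update_one(
        {"_id": post_id},
        {"$push": {"comments": comment}, "$inc": {"comment_count": 1}},
    )
```

If the comments lived in a separate collection, the two changes would be two separate operations and could be observed half-done.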

And in any case we can be tolerant: will anybody care if you post something and some of your FB friends see it with a 1 second delay?

You can choose:

  • Restructure, so you can rely on the ATOMIC operations.
  • Implement something in software… (buffff).
  • Be tolerant of a little inconsistency.

 

1:1 relationships example:

The data is employee – resume (assuming 1:1).
You can either have 2 collections, employee and resume, linked by id, or embed the resume in the employee (as long as the document does not exceed 16 MB). It will depend on:

  • freq of access
  • size of items
  • atomicity of data
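The two options can be pictured with plain Python dicts. The field names are made up, and the size check uses JSON length as a rough stand-in for the real BSON size.

```python
import json

MAX_DOC_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit

# Option A: two collections, linked by id.
employee_linked = {"_id": 1, "name": "Ada", "resume_id": 10}
resume_linked = {"_id": 10, "employee_id": 1, "text": "Ten years of backend work"}

# Option B: the resume is embedded in the employee document.
employee_embedded = {
    "_id": 1,
    "name": "Ada",
    "resume": {"text": "Ten years of backend work"},
}


def fits_in_one_document(doc: dict) -> bool:
    """Approximate check that a document stays under the size limit."""
    return len(json.dumps(doc).encode("utf-8")) < MAX_DOC_BYTES
```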

 Usually you will embed

1:N (to many) relationships example:

Data can be city – person. If we think of New York City, it makes sense to have 2 collections and include the city id in the people collection.
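A small sketch of that linking, with in-memory lists standing in for the two collections (names and ids are invented for the example):

```python
cities = [{"_id": "nyc", "name": "New York City"}]

# Each person carries the id of its city; the city document stays small.
people = [
    {"_id": 1, "name": "Ana", "city_id": "nyc"},
    {"_id": 2, "name": "Bo", "city_id": "nyc"},
    {"_id": 3, "name": "Cy", "city_id": "sf"},
]


def people_in(city_id: str, people_docs: list) -> list:
    """In-memory equivalent of querying the people collection by city_id."""
    return [p for p in people_docs if p["city_id"] == city_id]


print([p["name"] for p in people_in("nyc", people)])
```

Embedding millions of people inside the city document would not work, which is why the link goes from the many side to the one side.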

 

1:N(to few) relationships example:

Data can be blog – comment. In this case the best approach is usually to embed the comments in the blog.

 Usually you will embed from the many to the one

 

M:N relationships example:

Data can be books – authors or students – teachers. In both cases the relationship is few to few, so you usually embed the ids in one collection or the other (be careful with atomicity). Where to place them? It depends on how your app accesses the information. Of course you can put ids in both collections for better performance in some cases.

Usually you will link
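Here is a sketch of linking in both directions, with a helper that checks the two sides still agree. The documents are invented; the check is there because updating two documents is not atomic, so the links can drift apart.

```python
books = [
    {"_id": "b1", "title": "Book One", "author_ids": ["a1", "a2"]},
    {"_id": "b2", "title": "Book Two", "author_ids": ["a1"]},
]
authors = [
    {"_id": "a1", "name": "Author A", "book_ids": ["b1", "b2"]},
    {"_id": "a2", "name": "Author B", "book_ids": ["b1"]},
]


def links_are_consistent(book_docs: list, author_docs: list) -> bool:
    """True when every book -> author link has a matching author -> book link."""
    from_books = {(b["_id"], a) for b in book_docs for a in b["author_ids"]}
    from_authors = {(b, a["_id"]) for a in author_docs for b in a["book_ids"]}
    return from_books == from_authors
```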

Tip: Remember that to embed something, its container has to exist first, so in some cases you cannot insert…

 

 

Performance:

Benefits of embedding:

  • Improved read performance (spinning disks take a long time to get into position; if the info is stored close together, more of it can be read in the same spin)
  • One round trip to the DB (self explanatory)
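The round-trip difference can be shown with a toy in-memory store that counts lookups. CountingStore and the collection names are hypothetical stand-ins, not a real driver API.

```python
class CountingStore:
    """Hypothetical in-memory database that counts each request made to it."""

    def __init__(self, collections: dict):
        self.collections = collections
        self.round_trips = 0

    def find_one(self, name, doc_id):
        self.round_trips += 1
        return self.collections[name][doc_id]

    def find_many(self, name, doc_ids):
        self.round_trips += 1
        return [self.collections[name][i] for i in doc_ids]


def load_embedded(store, post_id):
    """Comments live inside the post: one request fetches everything."""
    return store.find_one("posts_embedded", post_id)


def load_linked(store, post_id):
    """Comments live elsewhere: one request for the post, one for the comments."""
    post = store.find_one("posts_linked", post_id)
    comments = store.find_many("comments", post["comment_ids"])
    return {"title": post["title"], "comments": comments}
```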

MongoDB has multikey indexes: an index on an array field gets one entry per array element. That's one of the reasons why it is so fast at searching inside embedded arrays.
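Conceptually, a multikey index maps every array element to the documents that contain it. The sketch below is only a mental model in plain Python, not how MongoDB implements it; with pymongo, calling create_index on an array field such as a hypothetical tags field gives you a multikey index.

```python
from collections import defaultdict


def build_multikey_index(docs: list, field: str) -> dict:
    """Map each value (or each array element) of `field` to matching doc ids."""
    index = defaultdict(set)
    for doc in docs:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]  # scalar fields get a single entry
        for value in values:
            index[value].add(doc["_id"])
    return index


posts = [
    {"_id": 1, "tags": ["mongodb", "schema"]},
    {"_id": 2, "tags": ["mongodb"]},
    {"_id": 3, "tags": "redis"},
]
tag_index = build_multikey_index(posts, "tags")
```

Looking up a tag is then a single dictionary access instead of a scan over every document.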

 

from: http://api.mongodb.org/

from: https://education.mongodb.com/courses/10gen/M101J/2013_October/courseware/Week_3_-_Schema_Design/MongoDB_Schema_Design/