Mongodb – Understanding how journal mode works with j:true

One of the things that has recently confused me with Mongodb, while using Mongoid with the Moped driver, is journal mode and the j:1 write concern.

Journal Mode

Journal mode ensures that your writes are recorded in a redo/write-ahead log after they are written to memory but before they are flushed to disk. It provides a level of durability: if the server crashes before certain data is written to disk, that data is read back from the journal files on restart and written to disk, making your data more resilient.

Journaling has been enabled by default on all 64-bit installations since version 2.0, but not on 32-bit ones.

What kind of stuff does journaling record

Journaling covers

  • Create/update operations
  • Index creation
  • Namespace changes

Journal Commit Interval

The journal commit interval (journalCommitInterval) is the time between subsequent writes to the journal files. It defaults to 100 ms when the journal and the data files are on the same block device, and to 30 ms when they are on different block devices.

This means you can lose data that has not yet been journaled in the last 100 ms. You can set journalCommitInterval in your mongodb.conf file to anywhere between 2 ms and 300 ms. A lower value increases the frequency with which data is journaled, at the cost of disk performance.
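
For example, in the old ini-style mongodb.conf the setting is a single line; the 50 ms here is just an illustrative value, not a recommendation:

# mongodb.conf
journal = true
journalCommitInterval = 50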

To force Mongodb to commit to the journal sooner, without changing the journalCommitInterval setting, you can use the j:1 write concern. As per the docs:

When a write operation with j:true is pending, mongod will reduce journalCommitInterval to a third of the set value.

So a low journalCommitInterval (whatever works best for you), combined with saves issued with j:1, ensures your writes are made durable with minimal delay.

To verify that your data was written to the journal you could call the getLastError command with j set to true.

Most drivers run this for you so this isn’t something that you need to worry about.
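
If you do want to issue it yourself, here is a rough sketch using Moped; the database and collection names are made up:

require 'moped'

session = Moped::Session.new(["127.0.0.1:27017"])
session.use "my_app_development"

session[:orders].insert(:number => "ORD-1", :total_cents => 4500)
# Ask the server to confirm that the write has reached the journal
puts session.command(:getlasterror => 1, :j => true).inspect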

Mongoid

The Mongoid documentation for versions 3.0 and above does not, for some reason, talk about the j:1 write concern, but here is what I found based on this discussion.

The default write in Mongoid is equivalent to calling it with something like the following.
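
A sketch, assuming Mongoid 3 with Moped, a hypothetical Order model, and that the safe options are passed straight through to the driver:

# An acknowledged write against the primary (w: 1)
Order.with(:safe => { :w => 1 }).create(:number => "ORD-1")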

This implies that the write will be acknowledged once the data is written to the primary’s memory, but not necessarily to disk or the journal. If you want to trigger a journal write as well, you have to ask for it explicitly.
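
Under the same assumptions, asking for a journal acknowledgement looks something like this:

# Acknowledged and journaled
Order.with(:safe => { :w => 1, :j => true }).create(:number => "ORD-1")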

Most of this information was collected and posted as an answer on Stack Overflow by mnemosyn and is available in the Mongodb docs. My goal was to create a quick reference.

References:

  1. http://docs.mongodb.org/manual/reference/write-concern/
  2. http://docs.mongodb.org/manual/core/journaling/
  3. http://docs.mongodb.org/manual/tutorial/manage-journaling/
  4. http://docs.mongodb.org/manual/reference/configuration-options/#journalCommitInterval
  5. http://docs.mongodb.org/manual/core/write-concern/#write-concern
  6. http://rsmith.co/2012/11/05/mongodb-gotchas-and-how-to-avoid-them/
  7. https://groups.google.com/forum/#!msg/mongoid/Bybccfc0gLI/Aei0VkMAZqoJ

Mongodb – not evil, just misunderstood

Lately I’ve been reading a lot of posts about Mongodb dissuading you from ever using it. Some of these articles are seriously outrageous and make me wonder what got those teams to start using Mongodb in the first place. Sarah Mei’s recent article was one that upset me a lot, especially since the title was so inflammatory.

My post, however, aims at highlighting the areas where Mongodb works and how it performed brilliantly for us. As someone leading the engineering efforts for a shipping and logistics company, I wasn’t too happy initially to see Mongodb being used as the primary datastore, but after two years I’m more than sure that this was the right datastore for us. I’ve outlined the areas that confused me when I first encountered them, only to learn that they were actually invaluable features available to me.

“No migrations” – is that all you have?
The advantages of schemaless documents are priceless. Not having to migrate is just one of the perks. Our schema was largely of the form Orders (having many) Shipments (going_from) ShipPoint (to) ShipPoint.

We rarely used any of these entities without the others, and it served us extremely well to manage them as self-contained documents embedding one another, along the lines of the sketch below.
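
A minimal sketch of what such a document model might look like in Mongoid (the class and field names here are illustrative, not our actual schema):

class Order
  include Mongoid::Document
  field :number, :type => String
  embeds_many :shipments
end

class Shipment
  include Mongoid::Document
  field :weight_kg, :type => Float
  embedded_in :order
  embeds_many :ship_points   # e.g. the going_from and to points
end

class ShipPoint
  include Mongoid::Document
  field :name,    :type => String
  field :address, :type => String
  embedded_in :shipment
end

Reading an order then pulls its shipments and their ship points in a single document fetch.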

Mongodb writes are fire and forget? WTF?
This doesn’t always have to be the case, though it contributes significantly to Mongodb’s fast writes. Mongodb’s write concern configuration lets you specify the precise level of persistence that must be reached for a write to be considered successful, so if the write fails you know it failed. Being able to know whether your write has propagated to replicas or has been journaled is a pretty neat feature; the sketch below shows the idea.
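
Roughly the spectrum, assuming Mongoid 3 with the Moped driver and a hypothetical Shipment model; the exact option names vary between driver versions:

# Fire and forget: do not wait for any acknowledgement
Shipment.with(:safe => false).create(:weight_kg => 12.5)

# Acknowledged by the primary (in memory)
Shipment.with(:safe => true).create(:weight_kg => 12.5)

# Acknowledged, journaled, and replicated to a majority of the replica set
Shipment.with(:safe => { :w => "majority", :j => true }).create(:weight_kg => 12.5)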

How can the default writes be fire and forget?
(Version 2.4.8 changes this default; the following was valid up to version 2.2.6.)
It just made sense: given all the information needed to configure it the way you prefer, I would always go with this approach. We add a lot of Notes to each shipment as it gets reviewed at different levels by the sales, accounts and other teams. These notes generally serve as a reminder, or a single line indicating that the shipment has been viewed, and they don’t critically affect the business workflows of the application. It just seemed logical that these were fire and forget operations that could be stored as quickly as possible.

Another place where this is extremely handy for us is tracking. We track several hundred shipments each day, logging every tracking status, location and timestamp while a shipment is in transit. This information lets customers keep an eye on where their shipment has reached. Chances are that some of the information is not saved the first time we fetch it, but we expect to pick it up during the next fetch 30 minutes later. The default write concern works brilliantly there.

Read locks and write locks – don’t they slow you down?
They do, but since most of the data is memory mapped this doesn’t affect you in a major way. However, I did notice people always hitting the primary of a replica set and never querying the secondaries, for fear of inconsistent data. If you have sufficient memory your replication lag should be pretty small, and if you don’t need the data to be consistent at every instant, querying a secondary is a sensible option to take load off your primary. Which brings me to the primaryPreferred read preference: it lets you query a secondary in your replica set when your primary is not available, which is a fairly safe choice in my opinion.

We began querying secondaries for ShipPoints, which didn’t change that often, along the lines of the sketch below.
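
A sketch using today's official Ruby driver (at the time we did this through Mongoid/Moped session options, whose syntax differed); the host names and database are made up:

require 'mongo'

# Prefer the primary, but fall back to a secondary if it is unavailable
client = Mongo::Client.new(
  ['db1.example.com:27017', 'db2.example.com:27017'],
  :database => 'shipping_app',
  :read     => { :mode => :primary_preferred }
)

client[:ship_points].find(:name => /Chennai/).to_a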

All the memory usage is killing me!
This is one of the things that took me time to accept. Mongodb expects that your working set fits into RAM along with the indexes for your database. Your working set is the data that is frequently queried and updated. Since Mongodb works with memory-mapped files, most of your working set is mapped into memory. When the data you need is not in memory, a page fault occurs and it has to be fetched from disk. This results in a performance penalty while the data is pulled back in from disk, but the occasional page fault is nothing to panic about.

While our working set was fairly small, our reporting application needed access to the entire set of shipment records to generate reports. This resulted in Mongo running out of memory and throwing OperationFailure errors on a regular basis.

Our initial approach was naive: we started using Redis (another datastore that’s pure gold) to store snapshots of the information, but we soon realised we could just use Mongodb to make it work.

So can I never generate reports without having my dataset fit in memory?
Rollups to the rescue. Rollups are pre-aggregated statistics that speed up your aggregation process. They make life significantly easier, since you only ever query short time ranges to generate micro-reports.

Here is a simplified sketch of how we generated daily and monthly aggregates with map-reduce.
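
A minimal sketch of the idea, assuming Mongoid 3's map_reduce with inline output and a hypothetical Shipment model carrying created_at and total_cents fields; our real jobs wrote to rollup collections and were more involved:

map = <<-JS
  function() {
    var d = this.created_at;
    var day = d.getFullYear() + "-" + (d.getMonth() + 1) + "-" + d.getDate();
    emit(day, { count: 1, total_cents: this.total_cents });
  }
JS

reduce = <<-JS
  function(key, values) {
    var result = { count: 0, total_cents: 0 };
    values.forEach(function(v) {
      result.count += v.count;
      result.total_cents += v.total_cents;
    });
    return result;
  }
JS

# One result per day: { "_id" => "2013-11-20", "value" => { "count" => ..., "total_cents" => ... } }
Shipment.all.map_reduce(map, reduce).out(:inline => 1).each do |doc|
  puts "#{doc["_id"]}: #{doc["value"]["count"]} shipments"
end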

So you mean this can’t be realtime?
Yes it can – through atomic updates. Just as we generated rollups to speed up reporting, we can maintain pre-aggregated snapshots of the same information, shaped something like the document below.
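
For example (field names illustrative), one document per day whose counters get bumped in place:

{
  "_id"      => "2013-11-20",
  "total"    => 412,
  "statuses" => { "in_transit" => 210, "delivered" => 180, "exception" => 22 }
}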

Once this is in place you can update your aggregates by simply incrementing the right counter with something like this:
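
A sketch using Moped's upsert (session is an already configured Moped session; the collection and field names match the illustrative document above):

# Atomically bump today's counters; the upsert creates the rollup
# document the first time it is touched
session[:daily_rollups].find("_id" => Date.today.to_s).upsert(
  "$inc" => { "total" => 1, "statuses.delivered" => 1 }
)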

I haven’t even touched upon the replication and sharding features that Mongodb offers which I will reserve for another post. To summarise I feel Mongodb is awesome and is a lot like the kid in class who you dismissed because your friends thought he was weird – till you got to know him.

Disclaimer: I don’t claim to be an authority on Mongodb and everything that I have written about is stuff that I’ve learnt while working with Mongodb. I recommend reading the documentation and going through the talks available on the Mongodb website.

Simple Quick SSH Tunneling to expose your web app

I’ve been a localtunnel user for quite some time now and I really love the fact that it’s a free service, quick to install and easy to use to expose your development app to the world. There are quite a few worthy alternatives now, like showoff.io and Pagekite, that pretty much do the same thing.

But at times it gets annoying (especially when in the middle of other work) that I’m unable to access localtunnel because it’s down, or that I’ve outrun the free usage limit on Pagekite.

I generally end up using localtunnel when I have to wait for an IPN from Paypal or Authorize.net (relay response) while working in my development environment. So here is a quick way to roll out a basic version of the service for your own needs.

Before we move to the “how to”, here is a quick intro to SSH tunneling: SSH can forward ports between machines over its encrypted connection, and with remote forwarding (the -R flag) a port on a remote host is tunnelled back to a port on your local machine.

Now, I assume that you have a staging server (or some server you have ssh access to).

The following terminal command does pretty much the same thing we do with the Ruby code that follows:
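
(A sketch reconstructed from the description below; abc.example.com and the ports 3000/9999 are just the example values used there.)

ssh -R 9999:localhost:3000 user@abc.example.com
# For the forwarded port to be reachable by everyone and not just from the
# remote host itself, the server's sshd_config needs "GatewayPorts yes".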

The -R indicates that you’ll be forwarding your local app running on port 3000 to the remote port 9999 on the remote host abc.example.com so that everyone can access it. Now your application running on localhost:3000 is accessible at abc.example.com:9999

We now do the same thing using the Ruby net-ssh library. The following code snippet is customised to my defaults, but it’s simple enough to change those settings.
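
Since the original snippet isn't shown here, this is a minimal reconstruction with net-ssh; the host, user and ports are assumptions matching the example above:

# expose_local.rb - forward remote port 9999 on abc.example.com to localhost:3000
require 'rubygems'
require 'net/ssh'

REMOTE_HOST = 'abc.example.com'   # assumed staging server
REMOTE_USER = 'deploy'            # assumed ssh user
REMOTE_PORT = 9999
LOCAL_PORT  = 3000

Net::SSH.start(REMOTE_HOST, REMOTE_USER) do |ssh|
  # Ask the remote sshd to listen on REMOTE_PORT and tunnel connections
  # back to the local app on LOCAL_PORT
  ssh.forward.remote(LOCAL_PORT, '127.0.0.1', REMOTE_PORT)
  puts "Tunnel up: http://#{REMOTE_HOST}:#{REMOTE_PORT} -> localhost:#{LOCAL_PORT}"
  ssh.loop { true }   # keep the session, and with it the tunnel, open
end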

Hope this helps

Mongoid – Comparing attributes belonging to the same document

I spent an insane amount of time trying to figure out how to elegantly compare attributes belonging to the same document, and stumbled upon a few Stack Overflow questions that helped. The solution still doesn’t look particularly elegant, but it does the job.

scope :expired, lambda {
  any_of({:completed_date.ne => nil},
         {:expiry_date.lte => Date.today.to_time},
         # a field-to-field comparison needs a JavaScript $where clause
         {"$where" => "this.usage_count >= this.max_usage_count"})
}

Certainly not noteworthy but this post is to help me document this bit.

Goal

To build a named scope to allow me to query expired documents. The model has the following fields


	field :code, :type => String
	field :allowed_domain, :type => Integer, :default => DOMAIN_ALL_SITES
	field :description, :type => String
	field :discount_type, :type => Integer, :default => PERCENTAGE_DISCOUNT
	field :dollar_discount_cents, :type => Integer, :default => 0 
	field :percentage_discount, :type => Integer, :default => 0
	field :max_usage_count, :type => Integer, :default => 1
	field :usage_count, :type => Integer, :default => 0
	field :expiry_date, :type => DateTime, :default => (DateTime.now + 1.week)
	field :completed_date, :type => DateTime	

	index :code
	index :expiry_date
	index :allowed_domain

The precondition for an expired record is that its completed_date has been set (is not nil), or its expiry_date is earlier than the present date, or its usage_count has reached the max_usage_count.

While the above block of code does that, what I hate is the mixing of raw Mongodb/JavaScript fragments and Ruby. While Mongoid for most purposes does a good job of making life easier (as every ORM should), there are occasions where I’d prefer it let me write pure Mongodb queries or use the Ruby wrappers entirely.

Other Resources
http://stackoverflow.com/questions/3795044/mongoid-named-scope-comparing-two-time-fields-in-the-same-document

Further bulletins as events warrant.

My Favorite TED Talks

I’ve been going through some of my favorite TED Talks; here’s the list.

1. Sir Ken Robinson on Creativity.

2. Simon Sinek, People don’t buy what you do, they buy why you do it

I love this talk just for that one line which just resonates with everything I believe in.

3. Dan Ariely

Lastly, to my favorite speaker Dan Ariely. I’ve been a big fan of his books and talks. I strongly recommend you read Predictably Irrational. Here are his talks

I can’t believe I missed this the first time around.

4. Derek Sivers – How to start a movement.

Async Responses in Thin and a Simple Long Polling Server

I’ve been playing with Thin, the Ruby web server that packages the awesomeness of Rack, Mongrel and EventMachine (\m/). However, the thing that blew me away completely was James Tucker’s async Rack support that is integrated into Thin. The combination opens up a whole new world of realtime responses, something that I’ve been constantly switching to NodeJS and Socket.IO for.

Async Rack

A simple rack application would look like this


# my_rack_app.ru

class MyRackApp
  def call(env)
    [200, {'Content-Type' => 'text/html'}, ["Hello World"]]
  end
end

run MyRackApp.new

# thin start -R my_rack_app.ru -D -p3000

This simple rack app, on hitting port 3000, responds with a status code (200), the content type and some body. Nothing special. With async Rack we’d be able to send the response head back while we’re still building the response; once the response is built, we send the body and close the connection.

A quick glance at Patrick’s blog (apologies for not being able to find the author’s last name) should give you an excellent understanding of what I’m trying to explain.

A simple async app


# my_async_app.ru

AsyncResponse = [-1, {}, []].freeze

class MyAsyncApp
  def call(env)
    Thread.new do
      sleep(10)   # simulate slowly building the response
      response = [200, {"Content-Type" => "text/html"}, ["<em>This is Async</em>"]]
      env['async.callback'].call(response)
    end

    AsyncResponse
  end
end

run MyAsyncApp.new


The AsyncResponse constant returns a -1 status, which tells the server that the response will be coming later, asynchronously. It is what the examples provided with Thin call an “async template”.

On a request, the code initially returns AsyncResponse while the thread waits on the sleep call. Once the thread is active again, it builds the response and sends it, the connection having been kept alive in the meantime.

An async app where we build the response and close the connection

From here on, it would really help to have the Thin source code open in your text editor. We will be looking at request.rb, response.rb and connection.rb in the lib/thin folder.


require 'rubygems'

class DeferrableBody
  include EventMachine::Deferrable

  # Thin calls each to stream the body; instead of yielding anything yet,
  # we just hold on to the block so we can push chunks through it later.
  def each(&blk)
    puts "Blocks: #{blk.inspect}"
    @callback_blk = blk
  end

  # Push chunks through the saved block; each call ends up as a send_data.
  def append(data)
    puts " -- appending data --"
    data.each do |data_chunk|
      puts " -- triggering callback --"
      puts @callback_blk.call(data_chunk)
    end
  end
end

class AsyncLife
  AsyncResponse = [-1, {}, []].freeze

  def call(env)
    body = DeferrableBody.new

    # Send the headers first, with the (still empty) deferrable as the body.
    EM.next_tick {
      puts "Next tick before async"
      sleep(3)
      puts "#{env['async.callback']}"
      env['async.callback'].call([200, {'Content-Type' => 'text/plain'}, body])
      sleep(5)
    }

    puts "-- activated 5 second timer --"
    EM.add_timer(5) {
      puts "--5 second timer done --"
      body.append(['Mary had a little lamb'])

      puts "-- activated 2 second timer --"
      EM.add_timer(2) {
        puts "--7 second timer done --"
        body.append(['This is the house that jack built'])
        puts "-- succeed called -- "
        # Mark the deferrable as done so the connection can be closed.
        body.succeed
      }
    }

    AsyncResponse
  end
end

run AsyncLife.new

This example uses an EventMachine Deferrable, which lets us decide when we are done building our response. The tricky part, which I struggled to get my head around, was the strange-looking each method and how the append method uses it to build our responses.

Walking through the code

When the request comes in from the browser, the application returns the AsyncResponse constant, which tells the server that the response body is going to be built over time. On a request, EventMachine’s post_init method (check out thin/connection.rb) creates the @request and @response objects. This triggers the post_process method, which on the first call returns without setting the header, status or body, and prepares for the asynchronous response.

On next_tick we begin to create our header. We initialize an EM::Deferrable object which is assigned as the response body; this ensures the header is sent ahead of the body, because we don’t have anything to iterate over yet in the each block where the response is sent. The env['async.callback'] is a closure over the post_process method, created by method(:post_process); check out the pre_process method in thin/connection.rb.

The each method in our DeferrableBody class overrides the default each implementation defined by the response object. Our each block is saved in the instance variable @callback_blk, which is called whenever we call the append method. So essentially we are calling send_data on each of the data chunks we send back through append.

Once that’s done we call succeed, which tells EventMachine to trigger the callback denoting that we’re done building the body. It ensures the request/response connection is closed.

The default each implementation


# Send the response
@response.each do |chunk|
  trace { chunk }
  send_data chunk
end

That’s pretty much what I gathered by going through the code on async responses with Thin and Rack. Another useful module is the thin_async bit written by the creator of Thin, @macournoyer, available here. It pretty much abstracts away the trouble of overriding the each block.

Here’s an example of a simple long polling server I built using thin available here

Hope this is helpful

Tagging with Redis

It’s been a long time since my last post, and this one is about Redis. I’ve been working on a project for a while now, and I had a story to implement where I had to associate posts with other posts identified as similar, based on a tagging system that already existed. The trouble was that the existing tagging system was closely tied to a lot of other functionality and couldn’t be easily (quickly) rewritten, and the feature needed to be rolled out quickly for several reasons which I won’t get into.

The Problem

The tags associated with each post were poorly organized: one record in the Tag model would hold all the tags associated with the post as a single whitespace-separated string (ouch!)


posts.tags.first.name # business money finance investment

So finding the posts that had 2 tags in common, from a database of approximately 2000 posts with at least 4 tags each, took a significant amount of time. Here is a basic benchmark of just finding the matches on each request.


Benchmark.measure do
  similar_posts = []
  Post.tagged_posts.each do |post|
    similar_tags = (post.tags.first.name.split & current_post.tags.first.name.split)
    similar_posts << post if similar_tags.size >= 2
  end
end

Here is what benchmark returns


 => #<Benchmark::Tms:0x111fd8cb8 @cstime=0.0, @total=2.25,
@cutime=0.0, @label="", @stime=0.19, @real=4.79089307785034, @utime=2.06>

Not Great.

So the next option was to pre-populate the related posts for every post and store them in the database as similar_posts, so that all that was required at runtime was to fetch the post along with its similar_posts. This seemed like an acceptable solution, considering the tags were not changed on a regular basis, but any change meant the similar_posts table had to be rebuilt (which took a long time). Here is the benchmark for fetching the pre-populated data from the database.


Benchmark.measure { p.similar_posts }
=> #<Benchmark::Tms:0x104d153f8 @cstime=0.0, @total=0.0100000000000007,
@cutime=0.0, @label="", @stime=0.0, @real=0.0333230495452881, @utime=0.0100000000000007>

Nice! But this came at the cost of having to rebuild similar_posts every time anything changed with the tags or the posts.

Redis

Redis is an awesome in-memory key-value and data-structure store that is really fast. Though it would be wrong to think of it as a silver bullet, it does a lot of things really well. One is the ability to store data structures and perform operations on them; there is also Pub/Sub and a lot of other cool stuff. It even allows you to persist the information, either through snapshotting, which takes periodic snapshots of the dataset, or through an append-only file, where every operation is appended to a file and replayed to recreate the dataset in case of failure.
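
The two persistence modes map onto a couple of redis.conf directives; the values below are the common defaults rather than a recommendation:

# Snapshotting (RDB): dump the dataset if at least 1 key changed in 900 s,
# 10 keys in 300 s, or 10000 keys in 60 s
save 900 1
save 300 10
save 60 10000

# Append-only file (AOF): log every write and fsync once per second
appendonly yes
appendfsync everysec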

In my case I didn’t need to persist the data, only to maintain a snapshot of it in memory. So, assuming some basic understanding of Redis and the redis gem, I’ll describe the approach.

We created a Redis SET for each tag, keyed by the tag name, so that every tag holds the post_ids of all the posts carrying that tag. To identify posts having two tags in common, all that was needed was the intersection of the tag sets, and Redis provides built-in commands (SINTER) for exactly these operations. That’s it!

 
{"communication" => ['12','219', .. , '1027']}  #sample SET

Fetching the similar posts


def find_similar_posts
  similar_posts = []
  if self.tags.present? && self.tags.first.present?
    tags = self.tags.first.name.split
    tags_clone = tags.clone
    tags_clone.shift
    tags.each do |tag|
      tags_clone.each do |tag_clone|
        # posts that share this pair of tags with the current post
        similar_posts << REDIS_API.sinter(tag.to_s, tag_clone.to_s)
      end
      tags_clone.shift
    end
  else
    puts "No matching posts."
  end
  # sinter returns arrays, so flatten and drop the current post's own id
  similar_posts = similar_posts.flatten.uniq - [self.id.to_s]
  Post.find(similar_posts)
end

Benchmark


 >> Benchmark.measure { p.find_similar_posts }
=> #<Benchmark::Tms:0x1100636f8 @cstime=0.0, @total=0.0100000000000007, @cutime=0.0, @label="",
@stime=0.0, @real=0.163993120193481, @utime=0.0100000000000007>
 

Which is pretty good, considering that we do all the computation within the request and nothing is pre-populated.