Sunday, 23 February 2014

Supported Data Types in MongoDB

Hi friends, we are back. Apologies for the delay; I was tied up with project deliverables (an integral part of a software engineer's life). Today we shall discuss the different data types supported in MongoDB.


Limitations in JSON data types

As we mentioned earlier, a document in MongoDB is modeled on a JavaScript object and written in a JSON-like form. JSON's structure is simple and easy to understand and parse, but it is limited to only six data types: null, boolean, number, string, array and object.

You might think these should be sufficient, at a high level, to express different structures of data. But a few additional types are still important. For example, JSON has no data type for dates, which is hard to accept, especially when JSON supplies the core data types of a database.

There is a numeric data type, but it does not differentiate between floats and integers, nor between 32-bit and 64-bit numbers.

It also lacks another important data type: regular expressions.
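
These gaps are easy to see in plain JavaScript, without MongoDB at all. The following is a minimal sketch (the sample object and its field names are made up for illustration) that round-trips an object through JSON and checks what survives:

```javascript
// Hypothetical sample object; the field names are illustrative only.
const sample = {
  createdOn: new Date(0), // a real Date object
  pattern: /foobar/i,     // a real RegExp object
  whole: 3,
  fractional: 3.0
};

// Serialize to JSON text and parse it back.
const roundTripped = JSON.parse(JSON.stringify(sample));

// The Date comes back as a plain string.
console.log(typeof roundTripped.createdOn);
// The RegExp comes back as an empty object.
console.log(JSON.stringify(roundTripped.pattern));
// JSON has a single numeric type, so 3 and 3.0 are indistinguishable.
console.log(roundTripped.whole === roundTripped.fractional);
```

The date becomes a string, the regular expression is lost entirely, and nothing records whether a number was meant as an integer or a float.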



MongoDB's Additional Support

MongoDB adds support for additional types on top of the existing JSON data types, to make it more flexible and to cover a wider range of data. While adding this support, JSON's original key-value format has been retained. The commonly supported types, and how they are represented in a document, are described below:

null

Null can be used to represent both a null value and a nonexistent field:
{"x" : null}

boolean

There is a boolean type, which can be used for the values true and false:
{"x" : true}

number

The shell defaults to using 64-bit floating point numbers. Thus, these numbers look “normal” in the shell:

{"x" : 3.14}

or:

{"x" : 3}

For integers, use the NumberInt or NumberLong classes, which represent 4-byte or 8-byte signed integers, respectively.
{"x" : NumberInt("3")}
{"x" : NumberLong("3")}
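
To see why a separate integer type matters, it helps to look at how a 64-bit float behaves. This is a small sketch in plain JavaScript (no MongoDB required; the variable names are illustrative). Whole numbers are exact only up to a limit, which is a likely reason the wrappers above are given their value as a string:

```javascript
// 3 and 3.0 are the same value: JavaScript has one number type (64-bit float).
const sameValue = (3 === 3.0);

// Largest integer a 64-bit float can represent without gaps: 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER;

// Beyond that limit, distinct integers collapse to the same float.
const collapsed = (2 ** 53 === 2 ** 53 + 1);

console.log(sameValue);                      // true
console.log(maxSafe);                        // 9007199254740991
console.log(collapsed);                      // true
console.log(Number.isSafeInteger(2 ** 53));  // false
```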

string

Any string of UTF-8 characters can be represented using the string type:
{"x" : "foobar"}

date

Dates are stored as milliseconds since the epoch. The time zone is not stored:
{"x" : new Date()}

MongoDB uses JavaScript's Date object. When creating a date, always call new Date() and not just the function Date(). Calling Date() as a plain function returns a string representation of the date, not an actual date object. This is how JavaScript works, so be careful to always call new Date() to avoid mixing strings with actual date objects.

Did you find that confusing? No worries, an example will clear it up:

Run your mongod instance and connect to it using the mongo shell. Use any database of your choice (say test). Type the following command:

> use test

Now insert a document using the new Date() constructor into a collection in the test db (datefld, for example):

> db.datefld.insert({_id:1, createdOn: new Date()})

Now insert a document using Date() function in the same collection:

> db.datefld.insert({_id:2, createdOn: Date()})

Find both the documents and see the difference in the createdOn field:

> db.datefld.find().pretty()
{
    "_id" : 1,
    "createdOn" : ISODate("2014-02-23T14:42:56.883Z")
}
{
    "_id" : 2,
    "createdOn" : "Sun Feb 23 2014 20:13:25 GMT+0530 (India Standard Time)"
}

The new Date() constructor (document with _id:1) returns a date object, while the Date() function call (document with _id:2) returns a string representation of the date.
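
The same difference can be checked without a database, since it comes from JavaScript itself. A minimal sketch in plain JavaScript (the variable names are illustrative):

```javascript
const viaFunction = Date();        // called without `new`
const viaConstructor = new Date(); // called with `new`

console.log(typeof viaFunction);             // "string"
console.log(viaConstructor instanceof Date); // true

// Only the Date object exposes the underlying milliseconds since the epoch.
console.log(typeof viaConstructor.getTime()); // "number"
```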

When the shell displays a date, it shows it as an ISODate in UTC (note the trailing Z in the output above), even though the value is stored in the database as a number of milliseconds since the epoch. Stored dates therefore carry no time zone information. As a workaround, you can store the time zone in a separate field.
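
One possible shape for that workaround is sketched below in plain JavaScript. The field name tzOffsetMinutes, its sign convention and the helper localWallClock are assumptions made for this example, not anything MongoDB defines:

```javascript
// A fixed instant so the example is reproducible (months are 0-based: 1 = February).
const createdOn = new Date(Date.UTC(2014, 1, 23, 14, 42, 56, 883));

// Hypothetical document shape: `tzOffsetMinutes` is an assumed field name.
const doc = {
  _id: 1,
  createdOn: createdOn,
  // Minutes to add to UTC to get the writer's local time (330 for GMT+0530).
  tzOffsetMinutes: 330
};

// Hypothetical helper: rebuild the writer's local wall-clock time from both fields.
function localWallClock(d) {
  const shifted = new Date(d.createdOn.getTime() + d.tzOffsetMinutes * 60 * 1000);
  // Format the shifted instant as UTC and drop the "Z" so no other zone is applied.
  return shifted.toISOString().replace("Z", "");
}

console.log(doc.createdOn.toISOString()); // the stored instant, in UTC
console.log(localWallClock(doc));         // the writer's local wall-clock time
```

Storing an offset in minutes keeps the date itself unambiguous while still letting a reader reconstruct what the clock showed where the document was written.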


regular expression

Queries can use regular expressions using JavaScript’s regular expression syntax:
{"x" : /foobar/i}
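
Because /foobar/i is an ordinary JavaScript regular expression, its behavior can be tried out in plain JavaScript first. A small sketch (the test strings are made up):

```javascript
const pattern = /foobar/i; // the `i` flag makes the match case-insensitive

console.log(pattern.test("FooBar"));        // true
console.log(pattern.test("my foobar doc")); // true: unanchored, so a substring matches
console.log(pattern.test("foo bar"));       // false: the space breaks the match
```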

array

Sets or lists of values can be represented as arrays:
{"x" : ["a", "b", "c"]}

Arrays in MongoDB can be used for both ordered (stack, list or queue) and unordered (set) operations. The following is an example of an array in MongoDB:

{"address" : [930, "Casanova Avenue", "CA", 93940, 10.5]}

As you can see, an element of an array can be of any type supported in MongoDB (the above example contains integer, float and string values). An array can also contain a nested array as an element.

One of the advantages of having an array in a document is that MongoDB understands its structure and can reach a specific element in order to query, update or delete it. For example, the average temperature (10.5) in the above document can be changed easily. MongoDB can also create indexes on array elements.
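
The idea of reaching one element can be sketched in plain JavaScript, using the same values as the address array above. The new temperature 11.2 is made up. In the shell, the comparable update would address the element by position, for example with $set on the key "address.4":

```javascript
// Same values as the "address" array in the text; index 4 holds the average temperature.
const doc = { address: [930, "Casanova Avenue", "CA", 93940, 10.5] };

// Each element keeps its own type.
const elementTypes = doc.address.map((value) => typeof value);
console.log(elementTypes);

// Change only the element at index 4 and leave the rest untouched.
doc.address[4] = 11.2;
console.log(doc.address);
```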



Embedded document

Documents can contain entire documents embedded as values in a parent document:
{"x" : {"foo" : "bar"}}

Documents can be used as the value for a key. This is called an embedded document. Embedded documents can be used to organize data in a more natural way than a flat structure of key/value pairs. For example, if we have a document representing a person and want to store his address, we can nest this information in an embedded "address" document:

{
    "name" : "Pradosh Chandra Mitra",
    "address" : {
        "street" : "21 Rajani Sen Rd",
        "city" : "Kolkata",
        "state" : "WB"
    }
}

The value for the "address" key in the previous example is an embedded document with its own key/value pairs for "street", "city", and "state". As with arrays, MongoDB “understands” the structure of embedded documents and is able to reach inside them to build indexes, perform queries, or make updates.
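
MongoDB addresses fields inside an embedded document with dot notation, such as "address.city". To illustrate what "reaching inside" means, here is a minimal sketch in plain JavaScript. The helper getByPath is hypothetical and written only for this example; it is not a MongoDB API and says nothing about how MongoDB implements the lookup:

```javascript
const person = {
  name: "Pradosh Chandra Mitra",
  address: { street: "21 Rajani Sen Rd", city: "Kolkata", state: "WB" }
};

// Hypothetical helper: follow a dotted path one key at a time.
// Returns undefined as soon as a key along the path is missing.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (current, key) => (current == null ? undefined : current[key]),
    doc
  );
}

console.log(getByPath(person, "address.city")); // "Kolkata"
console.log(getByPath(person, "address.zip"));  // undefined: no such field
```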



object id

An object id is a 12-byte ID for documents. Our next discussion will describe this in detail:
{"x" : ObjectId()}


There are also a few less common types that you may need, including:

binary data

Binary data is a string of arbitrary bytes. It cannot be manipulated from the shell. Binary data is the only way to save non-UTF-8 strings to the database.

code

Queries and documents can also contain arbitrary JavaScript code:
{"x" : function() { /* ... */ }}


