Sunday, February 10, 2013

A NoSQL Modeling Philosophy

A lot of digital ink has been spilled about NoSQL, including by me. Lately, there has been some confusion and some complaints. I have a simple model-design philosophy that I hope will satisfy at least some enthusiasts as well as critics of NoSQL. I have been mostly dealing with document-oriented NoSQL, but this might be applicable to other architectures. Here it is...


  • If the actual information might be duplicated across objects/documents (or whatever corresponds to relational rows), put that data in a separate collection/table, and put its document IDs wherever the "main" objects need them. I believe this is called normalization. 
  • If the actual information is unique to that object, it belongs in that object... even if that means using arrays. Sub-structures might need sub-arrays and/or the IDs of other objects. 
What might such a model look like? My favorite example is a modern music collection that consists mostly of albums by specific bands (as opposed to various-artist albums). 

First, let's consider what a band/artist might have. The obvious one is a name. Another might be a location. Another obvious thing a band would probably have is albums. Now, bands might share a location, but aren't likely to share an album. So, locations would be stored as separate objects pointed to by band objects. Albums of a band, on the other hand, would be stored as an array in that band object. 

Now, albums have (mostly) unique names and release dates that are fairly random. So, an album structure would have a name and a release year. Looking up albums by genre can often be useful, and genres would be things albums of different bands would have in common. So, if your API allows, create genre objects and point to them from albums. So, individual albums have names, years, and genre IDs. 

So, now we have a band object structure with names, location IDs, and album arrays. We also have location and genre objects. Nothing would stop you from having location ID arrays, if it seems necessary. 
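To make the structure above concrete, here is a minimal sketch using plain JavaScript objects in place of a real database. Every name, ID, and value in it (the band, the location, the genres, and the describeBand helper) is made up for illustration; it only shows where the shared data and the unique data end up.

```javascript
// Shared information lives in its own collections, keyed by ID.
const locations = {
  loc1: { name: "Seattle, WA" },
};
const genres = {
  gen1: { name: "Rock" },
  gen2: { name: "Blues" },
};

// Information unique to a band stays inside the band document,
// including its albums as an embedded array.
const bands = {
  band1: {
    name: "Example Band",
    locationId: "loc1",
    albums: [
      { name: "First Album", year: 2009, genreIds: ["gen1"] },
      { name: "Second Album", year: 2012, genreIds: ["gen1", "gen2"] },
    ],
  },
};

// Resolve the ID references of one band into readable text.
function describeBand(bandId) {
  const band = bands[bandId];
  const location = locations[band.locationId].name;
  const albums = band.albums.map(function (album) {
    const genreNames = album.genreIds.map(function (id) {
      return genres[id].name;
    });
    return album.name + " (" + album.year + ", " + genreNames.join("/") + ")";
  });
  return band.name + " from " + location + ": " + albums.join("; ");
}

console.log(describeBand("band1"));
```

Renaming a genre or a location touches one object, while everything that belongs to a single band comes back in one read.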

Now, if you have a lot of "various artist" albums, obviously you would want to use a different model. Actually, in that case, a graph-based database might be appropriate. 

I hope this blog clarifies NoSQL modelling. My e-mail address is euric.reiks@comcast.net ; let me know if you have any questions or other examples. If you have a Google e-mail address, you can leave comments on this page. Thanks for reading, and God bless you! 

  

Tuesday, December 18, 2012

Couchbase & .NET

In CouchDB, views (design documents) were sent to the server as documents with a specific structure. With Couchbase, views can be edited and saved directly on the console. Suppose you wanted documents based on specific values of one or more elements. I created a view with the following "map":

function (doc, meta) {
  if (doc.firstName && doc.lastName)
  {
  emit([doc.firstName, doc.lastName], doc);
  }
}



Here's how this query works... the program sends an object array consisting of a first name and a last name to Couchbase. It searches the bucket and server node specified in the server instance for documents with elements called "firstName" and "lastName". If a document qualifies, and the incoming data matches that document's specified contents, that document is included in the returned collection.
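As a rough picture of that matching, here is a small simulation in plain JavaScript. It is not how Couchbase is implemented: buildIndex, queryByKey, and the sample bucket are hypothetical stand-ins, and emit is passed in as a third argument only so the map function can run outside the server.

```javascript
// The map function from the view, with emit passed in so it can run here.
function map(doc, meta, emit) {
  if (doc.firstName && doc.lastName) {
    emit([doc.firstName, doc.lastName], doc);
  }
}

// Build the index: run map over every document and collect the emitted rows.
function buildIndex(bucket) {
  const rows = [];
  Object.keys(bucket).forEach(function (id) {
    map(bucket[id], { id: id }, function (key, value) {
      rows.push({ id: id, key: key, value: value });
    });
  });
  return rows;
}

// Query by key: keep the rows whose key equals the requested key exactly.
function queryByKey(rows, key) {
  const wanted = JSON.stringify(key);
  return rows.filter(function (row) {
    return JSON.stringify(row.key) === wanted;
  });
}

// Made-up sample documents; the third one has no name and emits nothing.
const bucket = {
  "1": { firstName: "Ross", lastName: "Albertson" },
  "2": { firstName: "David", lastName: "Albertson" },
  "3": { title: "not a name document" },
};

const index = buildIndex(bucket);
const matches = queryByKey(index, ["Ross", "Albertson"]);
console.log(matches.map(function (row) { return row.id; }));
```

Documents without both elements never reach the index, so they can never match a key.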
 
But, first things first. In this case, I created a console app, and modified the app.config file to include the following:
 
<configSections>
<section name="couchbase" type="Couchbase.Configuration.CouchbaseClientSection, Couchbase"/>
</configSections>
<couchbase>
<servers bucket="default" bucketPassword="">
<add uri="http://127.0.0.1:8091/pools"/>
</servers>
</couchbase>


These entries go before any other node in the root node. You might need to change the URI, bucket, or password, given your configuration. You can have multiple URIs as well.
Once you have changed your app.config (or web.config as the case might be), you can create a server instance with just

var client = new CouchbaseClient();

In case you're wondering which libraries I added, here they are...

using Couchbase;
using Couchbase.Extensions;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

Before I go any further, I used the following class for my data...

using System.ComponentModel.DataAnnotations;
using Newtonsoft.Json;

namespace ConsoleCouchbase2
{
   public class MyName
   {

       [JsonProperty("FirstName")]
       public string FirstName { get; set; }

       [JsonProperty("LastName")]
       public string LastName { get; set; }
   }
}

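Couchbase stored the names of the elements in camel case ("firstName" and "lastName"), which is why the view above tests doc.firstName and doc.lastName rather than the capitalized names. This caused me some confusion for quite a while.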

Anyway, back to the main code. The document Store modes I experimented with were Add and Set. "Add" adds the document only if that document ID doesn't exist in the bucket. "Set" adds the document if that ID doesn't exist; if it exists, the old document gets replaced with the new one. Here is how I stored my documents...

var test1 = new MyName() { FirstName = "Ross", LastName = "Albertson" };
var test2 = new MyName() { FirstName = "David", LastName = "Albertson" };
client.StoreJson(Enyim.Caching.Memcached.StoreMode.Add, "1", test1);
client.StoreJson(Enyim.Caching.Memcached.StoreMode.Add, "2", test2);

StoreJson is needed to store structured objects. Store can be used for storing strings or other single units of information. The second argument is the document ID.
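The difference between Add and Set can be sketched with a toy in-memory bucket. This is plain JavaScript, not the Couchbase client; the add and set functions here are hypothetical stand-ins that only mimic the behavior described above.

```javascript
// A toy in-memory bucket; the real storage happens inside Couchbase.
const bucket = new Map();

// "Add": store only if the ID is not already present.
function add(id, doc) {
  if (bucket.has(id)) {
    return false; // ID already taken; the existing document is left alone
  }
  bucket.set(id, doc);
  return true;
}

// "Set": store unconditionally, replacing any existing document.
function set(id, doc) {
  bucket.set(id, doc);
  return true;
}

const firstAdd = add("1", { firstName: "Ross", lastName: "Albertson" });
const secondAdd = add("1", { firstName: "David", lastName: "Albertson" });
const nameAfterAdds = bucket.get("1").firstName;

const overwrite = set("1", { firstName: "David", lastName: "Albertson" });
const nameAfterSet = bucket.get("1").firstName;

console.log(firstAdd, secondAdd, nameAfterAdds, overwrite, nameAfterSet);
```

So Add is the safer choice when an existing document must not be overwritten, and Set is the one to use for "insert or update".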

Now, to put it all together...

I first made a "key" object containing the data I want to match...

string[] key = { "Ross", "Albertson" };

To find a matching document, I did the following...

var view = client.GetView<MyName>("dev_names", "get_name").Key<string[]>(key);

foreach (var oneName in view)
{
   Console.WriteLine("Found: {0} {1}", oneName.FirstName, oneName.LastName);
}

"dev_names" is the name of the design document; "get_name" is the name of the view. I told  Couchbase to give me a collection of MyName's. The key type is an array of strings. If I wanted all the documents this view was capable of giving me, I would write

var view4 = client.GetView<MyName>("dev_names", "get_name");

foreach (var foundName in view4)
{
  Console.WriteLine("{0} {1}", foundName.FirstName, foundName.LastName);
}

If you wanted to create a view for this specific purpose, a map function whose body is just emit(null, doc); would work. If instead you wanted to count the matching documents, a view with the same "map" as before, plus _count as the reduce (and nothing more), will do the job. I did just that with the view name "count_matches". This is the code I used to count my matches...

var view2 = client.GetView<int>("dev_names", "count_matches").Key<string[]>(key);

foreach (var x in view2)
{
  Console.WriteLine("Found {0} match(es)", x);
}

You can probably guess that
 
var view3 = client.GetView<int>("dev_names", "count_matches");

gives me the total number of possible matches. Lastly, if you want to limit the scope of the client, you can wrap it in a "using" block. I hope this post helps! 
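For anyone who wants to see what the _count reduce amounts to, here is a small simulation in plain JavaScript. The rows are made-up sample data and the count and countForKey helpers are hypothetical; the real counting is done by Couchbase on the server.

```javascript
// Rows as the "map" from the view would emit them: key = [firstName, lastName].
const rows = [
  { id: "1", key: ["Ross", "Albertson"] },
  { id: "2", key: ["David", "Albertson"] },
  { id: "3", key: ["Ross", "Albertson"] },
];

// Stand-in for the built-in _count reduce: it just counts rows.
function count(someRows) {
  return someRows.length;
}

// With a key, only the rows carrying exactly that key are reduced.
function countForKey(allRows, key) {
  const wanted = JSON.stringify(key);
  return count(allRows.filter(function (row) {
    return JSON.stringify(row.key) === wanted;
  }));
}

console.log(countForKey(rows, ["Ross", "Albertson"])); // matches for one name
console.log(count(rows));                              // every row in the view
```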








 

Sunday, December 16, 2012

Couchbase on Windows 8

Getting Couchbase to work on Windows 8 is a little work, but not impossible. You can download an installer from www.couchbase.com/download; just use the defaults and you should be set. When the Console website pops up, use whatever settings make the most sense.

If you're having trouble with your copy, and your settings are fine, you might need to do a little hacking. Try this: download the Zip file from http://www.mediafire.com/?xc8nurnxjqr8klb , and unpack it to a safe place. You should find a DLL in the folder. Then, get into Windows Explorer or the Command Prompt as an Administrator, and replace C:\Program Files\Couchbase\Server\bin\libtcmalloc_minimal-4.dll with the new DLL. Restart the Service in Control Panel, and everything should work. I did this on a 2.0.0 install to get it to work. A fix might be in 2.0.2 and the next public release.

Lastly, if you want to use a .NET language with Couchbase, create your project, then get into the Package Manager console and enter

install-package CouchbaseNetClient

at the prompt. That should install the latest API into your solution. Later, I'll show you some ideas for how to use Couchbase in C#.

Sorry for the earlier mistake!

Wednesday, November 14, 2012

Using CouchDB with Java and Domain Objects

This is a simple application using CouchDB in a Java program. I'm using the LightCouch JAR downloaded from www.lightcouch.org; the current version was 0.0.4 when I downloaded it. This version of my code uses a domain class; if there is enough demand (e-mail me at euric.reiks@gmail.com), I'll post the JSON version too.

First, I generated a Java package called "org.lightcouchtest4.oop" in my LightCouchTest4 project. The next step was creating a "properties" file in the root of my package called "couchdb2.properties". "couchdb.properties" is the default for this purpose, but I already had one. Here is what it looks like...


# To change this template, choose Tools | Templates
# and open the template in the editor.
couchdb.name=db-java4-oop
couchdb.createdb.if-not-exist=true
couchdb.protocol=http
couchdb.host=127.0.0.1
couchdb.port=5984
couchdb.username=
couchdb.password=

I hope these settings are self-explanatory. You might need to use different values in your file. Yes, LightCouch will create the database if it doesn't exist when you set "couchdb.createdb.if-not-exist" to true.

The next step is to construct the directory tree for the database views. These are sort of like stored procedures. In CouchDB, queries are created in design document JSON objects, with JavaScript doing the heavy lifting. With LightCouch, you just need the JavaScript. At the root of my package, I created two directory paths, /design-docs/example/views/by_all and /design-docs/example/views/by_name. In the first path, I created a map.js file containing


function(doc) {
emit(doc, doc);
}

This might look silly, but it enables me to retrieve all my documents of a given type at once. In version 0.0.4, this might be the only way to do it. The other path ends in a map.js file containing


function(doc) {
emit([doc.first, doc.last], doc);
}

This sleight of hand enables me to search on a first name and a last name simultaneously. This program stores people's names, but first checks to make sure the name isn't already on file. What does the Name class look like? I'm glad you asked...


package org.lightcouchtest4.oop;
import org.lightcouch.*;
/**
 *
 * @author Ross
 */
public class Name extends Document {
    public String first;
    public String middle;
    public String last;
}

The Document class gives my Name objects an _id and a _rev. Adding a Name object to the database generates these attributes automatically. If you use Map<String, Object> to create your documents, you'll have to generate the _id's yourself.

Now for the main code.  Here is the whole enchilada:


package org.lightcouchtest4.oop;
import java.util.*;
import org.lightcouch.*;

/**
 *
 * @author Ross
 */
public class Main {
    public static void main(String[] args) {
        CouchDbClient dbClient = new CouchDbClient("couchdb2.properties");
        DesignDocument designDoc;
        designDoc = dbClient.design().getFromDesk("example");
        Response response;
        response = dbClient.design().synchronizeWithDb(designDoc);
        
        Scanner sc = new Scanner(System.in);
        System.out.print("First name: ");
        String firstName = sc.nextLine();
        System.out.print("Middle name: ");
        String middleName = sc.nextLine();
        System.out.print("Last name: ");
        String lastName = sc.nextLine();
        
        int dupCount;
        String[] keys = {firstName, lastName};
        dupCount = dbClient.view("example/by_name").key((Object[]) keys)
                .query(Name.class).size();
        
        if (dupCount > 0) {
            System.out.println("Found a copy");
        }
        else {
            Name name = new Name();
            name.first = firstName;
            name.last = lastName;
            name.middle = middleName;
            dbClient.save(name);
            System.out.println("New name was submitted");
        }
        
        List<Name> names = dbClient.view("example/by_all")
                .includeDocs(true).query(Name.class);
        System.out.println("\n Names list:");
        for (Name myName : names) {
            System.out.println(myName.first + " " + 
                    myName.middle + " " +
                    myName.last);
        }
        dbClient.shutdown();
    }
    
    
}

Creating a CouchDbClient instance connects the program to CouchDB using the settings in the given file. The getFromDesk() call loads the design document files from the hard drive into designDoc; synchronizeWithDb() then updates CouchDB with that design document if necessary and returns its result in response. dupCount gets the number of documents whose data matches the entered name. (The Object[] cast is necessary.) If matches are found, a message appears on the screen. Otherwise, the entered data gets stuffed into a Name object and submitted to CouchDB. You might be wondering why the last query doesn't have a key. That omission tells CouchDB to retrieve all the Name-typed documents. includeDocs(true) says to grab the entire documents; without that, you would have to first get the _id's of the documents, then execute find() with the _id values to get the actual data. Lastly, as you might have guessed, the shutdown() method prevents memory leaks and disconnects from the database.

This code uses Java 7 syntax. You might have to make modifications if you're using an older version of Java.
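If you are used to hand-written CouchDB design documents, the design-docs directory tree described earlier can be pictured as one. The sketch below is plain JavaScript with a nested object standing in for the folders; buildDesignDoc is a hypothetical helper, and the exact document LightCouch generates may differ.

```javascript
// Stand-in for the folders: design-docs/<name>/views/<view>/map.js
const designDocsDir = {
  example: {
    views: {
      by_all: { "map.js": "function(doc) { emit(doc, doc); }" },
      by_name: { "map.js": "function(doc) { emit([doc.first, doc.last], doc); }" },
    },
  },
};

// Fold one design-document folder into a single JSON design document.
function buildDesignDoc(dir, name) {
  const views = {};
  Object.keys(dir[name].views).forEach(function (viewName) {
    views[viewName] = { map: dir[name].views[viewName]["map.js"] };
  });
  return { _id: "_design/" + name, language: "javascript", views: views };
}

const designDoc = buildDesignDoc(designDocsDir, "example");
console.log(JSON.stringify(designDoc, null, 2));
```

The folder name becomes the part of the _id after "_design/", and each views subfolder becomes one named view, which is why the Java code queries "example/by_name" and "example/by_all".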




Friday, August 17, 2012

PHP-On-Couch

Hi, and welcome to CouchDB Tips! Today, we're going to do some PHP coding, using 5.4 syntax. So, let's get started...

We're going to write a web page that adds a name passed to it by a form, as well as creates and uses a view that employs a composite key. The calling form passes the first name as "first" and the last name as "last". The first thing we need to do is access the PHP-On-Couch library. I put mine in a folder called "PHP_On_Couch" under C:\inetpub\wwwroot. Yes, I use IIS.


<?php
require_once "PHP_On_Couch/lib/couch.php";
require_once "PHP_On_Couch/lib/couchClient.php";
require_once "PHP_On_Couch/lib/couchDocument.php";

Next, we're going to retrieve the user's input:


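// Read the form fields; fall back to an empty string if one is missing.
$first = isset($_REQUEST['first']) ? $_REQUEST['first'] : '';
$last = isset($_REQUEST['last']) ? $_REQUEST['last'] : '';
// Escape user input before echoing it back into the page.
print "first = " . htmlspecialchars($first) . ", last = " . htmlspecialchars($last) . "<br />\n";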

We need to connect to the CouchDB database. We're calling our database "php_couch3".

$client = new couchClient("http://localhost:5984", "php_couch3");

The next step is to create the database if necessary, and create our view.


if (!$client->databaseExists())
{
  $client->createDatabase();
  $view_fn="function(doc) { if (doc.firstname && doc.lastname) { emit([doc.firstname,doc.lastname], doc); } }";
  $design_doc = new stdClass();
  $design_doc->_id = '_design/all';
  $design_doc->language = 'javascript';
  $design_doc->views = ['by_name' => ['map' => $view_fn]];
  $client->storeDoc($design_doc);
}

This view checks if the document contains an actual name. If it does, it combines the first and last names into a composite key. The next thing we'll do is display all documents whose contents match the input and count them.


$opts = ['key' => [$first, $last]];
$response = $client->setQueryParameters($opts)->getView("all", "by_name");
print_r ($response);

print "<br />\n"; 
print "Found " . count($response->rows) . "<br />\n";


Now, we're going to set up an "auto_increment" and add the new data.

$count = count($client->getView("all", "by_name")->rows);

print "count = $count<br />\n";

$name = new stdClass();
$name->_id = strval($count + 1);
$name->firstname = $first;
$name->lastname = $last;
print "Adding name...<br />\n";

try {
$response = $client->storeDoc($name);
} catch (Exception $e) {
print "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
exit(1);
}
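One caveat about this kind of "auto_increment": an ID derived from the document count can collide with an existing ID once a document has been deleted (or when two requests arrive at the same time). CouchDB rejects a new document whose _id is already in use, so the catch block above is what you would see in that case. The sketch below shows the collision with a plain JavaScript object as a hypothetical stand-in for the database; the names in it are sample data.

```javascript
// Toy database keyed by _id, standing in for the CouchDB database.
const db = {
  "1": { firstname: "Ross", lastname: "Albertson" },
  "2": { firstname: "David", lastname: "Albertson" },
  "3": { firstname: "Example", lastname: "Person" },
};

// Same idea as the PHP code: next ID = number of name documents + 1.
function nextId(database) {
  return String(Object.keys(database).length + 1);
}

const beforeDelete = nextId(db); // three documents, so the next ID is "4"

delete db["2"];                  // one document goes away

const afterDelete = nextId(db);  // two documents, so the next ID is "3"
const collides = Object.prototype.hasOwnProperty.call(db, afterDelete);

console.log(beforeDelete, afterDelete, collides);
```

For a tutorial this is fine; for anything more serious, letting CouchDB assign the _id avoids the problem.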

The last thing we're going to do is list all the names in the database.


$response2 = $client->getView("all", "by_name");
foreach ($response2->rows as $name2)
{
  # print_r ($name2);
  print $name2->value->_id . ": " . $name2->value->firstname . " " . $name2->value->lastname;
  print "<br />\n";
}
?>

And that's it! I hope you enjoyed this tutorial. Direct all questions to me (Ross Albertson) at euric.reiks@gmail.com. You can find PHP-On-Couch at https://github.com/dready92/PHP-on-Couch/.  I might do C# next.