Sunday, June 9, 2013

The Model is a model (not a table)

Every website has its own style or quirks that make it unique so code reuse and abstraction are really important to web programming. I would argue that making code manageable is more important in web programming than making it efficient. There are probably particular pieces of your site that MUST be efficient but there are 3rd party technologies to help with that and many start-ups have had to rethink and reiterate their core product. Programmer time is one costliest things for web production so cutting the time it takes to alter and maintain code is really important not only for the product's sake but also for the economic gain. This is why MVC (Model View Control) structure is been so widely adopted. There have been many MVC frameworks that have come out and various tweaks on each piece of the MVC puzzle. Today we will talk about the ORM (object relational model) a wrapper/structure for the model part of the MVC and why is sucks.

ORM Sucks

.....what what what .... ORM is awesome ... it makes my code so organized ...

No it doesn't ... and its most useful part can be done in about 50 lines of code.

An ORM creates classes based on your table structure so that you can interact with PHP code to alter and retrieve data from your database. By using an ORM you can greatly reduce the amount of sql queries you write by using the classes that are generated for you. Symphony is a popular MVC framework for PHP, it is very robust and has about every common class and tool that is used in web development. It comes packaged with the Doctrine ORM.
Here is an example of a simple use case of an ORM class:

$query = $repository->createQueryBuilder('p') ->where('p.price > 19') ->orderBy('p.price', 'ASC') ->getQuery();

And here is the query it generates:
SELECT * FROM Product AS p WHERE p.price > 19 ORDER BY p.price;

For this particular example using the ORM object makes creating the query longer which is just a waste. Also the query will have to be generated which causes some overhead. Third there is a paradox with using an ORM. If writing a simple query is too stressful for a programmer then why would you trust them with the database work anyways. Knowing how to access data and the right way to access data is an important distinction. I would not trust anybody to work with my database code if they could not do it without an ORM. Number 4: this is cute for simple queries but when you have to do complex joins and sub-queries you will have to create it yourself because this kind of stuff is too specific for ORM to generate (Doctrine has a query function that lets you pass in a string for just this occasion). These type of queries are usually tied to core features that combine your tables in meaningful and powerful ways. Its these type of queries that require a knowledgeable database person, so hopefully you have one that is able to work outside the ORM.

ORM in 50 lines of code

To me the ORM gains you little and keeps you from learning to make good queries (like training wheels), but there are two reasons for ORM use. One is that you just don't have that database knowledge in house yet and you want to be able to build quickly so you put that work on the back burner and go as fast as you can on training wheels. The second more important reason is because you want to eliminate spelling errors which do occur when you write hundreds of queries by hand but these are easily catchable/fixable.

So as I said before that the ORM stuff could be done in about 50 lines of code, here is the PHP code to prove it:

//insert a piece of data function createInsert($table , $data){ //for each item in the data array create the query strings $columns = array_keys($data); $column_str = '('.implode(',',$columns).')'; $value_str = 'VALUES('.implode(',',$data).')'; $query = "INSERT INTO ".$table." ".$column_str." ". $value_str; return $query; } //select a piece of data function selectData($table , $cols, $where){ $columns = implode(',',$cols); $query = "SELECT ".$columns." FROM ".$table; if($where){ $query .= " WHERE ".$where['col'].$where['compare']. $where['value']; } return $query; }

There are of course many other simple little functions to create and you might point out that these don't "clean" the data or accept multiple items in the where clause. Those issues can be easily fixed. As you can see it is not too difficult to generate these simple queries and mitigate your query spelling errors. For the difficult stuff you still should create a custom query but for the simple stuff just a couple of little function helpers will work. There is no need to build classes for each table and manage XML or YML config files. Don't be scared of your database, in time you will learn it but first you have to look inside.

Beyond the Data

No matter if you use an ORM or queries or stored procedures this is not the last step that you must perform to display the results. Often times a new business requirement will arise that will demand your data be represented in a different way or compared with new data. This is the true job of the model, or this is the job that your model is abstracted out for. A good model is a reusable chunk of logic that interacts with your data to save and display a feature for your business. You may have this feature around many places in your application and you might also have this feature affected in conjunction with other features (a marketing mail updater that is used in registering, purchases, surveys, etc.). Encapsulating the data interactions away from the controllers that display these features will allow for code reuse and will further separate the data away from the presentation.

Here's what the general call stack will be in regards to an MVC framework:

routing for url (client question)
controller setup (loading libraries , reading session)
model initialize
ORM / stored procedures / custom queries
model data processing
controller logic (handle data base / edge cases)
view (server response)

An implementation example would be that instead of passing around a function through out your controllers that says "saveUserData" you would have a "Register" model that has an "authorize" function calls the saveUserData function in your query library. Now if you want to add functionality the "authorize" process (mailing) you only have to do it in the "authorize" method and not in every controller where a user can be registered. Instead of duplicating calls to the database you duplicate calls to features which access the database in the form that the feature needs to.

Discussing code so generically always seems like a trivial task that ends up wasting a lot of energy and not producing anything. I will not fool you to think I have produced anything here, I have not. Every product is different and every product has specific features that have an optimal implementation. Then a new request comes in and you end up having to refactor a class, create a new parent or reconfigure the schema of your database. An optimal implementation only lasts as long as that instance of the project lasts. Eventually (daily) you will receive a new feature/add-on and realize that you need to refactor, this is why it's so important to keep things manageable.

This happens a lot in the web environment so my above suggestion is not to optimize a product but optimize the process of creating products. By using a MVC you can keep the data(model), and the visual representation(view) separate which makes representation maintainable. I took this a step further to explain that using feature classes (model) keeps your feature access(controller) and the storage library (database queries) separate which makes logic/feature processing more maintainable.

You can use an ORM but an ORM will not solve the problem you think an ORM is solving. An ORM is just a different way to write an inline query. You want to abstract this out and make a function/class interface so that if the underlying logic changes you don't have to alter every bit of code that uses this logic (just the interface function). If a programmer requires an ORM or if the programmer cannot write a query that cannot be generated by an ORM I would suggest not letting that person write any of your database code.

No comments:

Post a Comment