Source Controlled, Scalable, Load Balanced Deployment With Drupal Can Be Fun!

Josh Lind

I'm convinced that running a speedy, load balanced site with a static CDN while maintaining a sane workflow in Drupal is no longer difficult. I believe this even though I've never set it up end-to-end, only in pieces on various projects. But it's 2011 and IT isn't hard anymore, right? Rolling out cloud servers, and even cloud hosting services, is a consumer type experience. So I believe that managing a codebase and having a deployment strategy, without creating an IT-lorded-over black box production environment, should be pretty simple. I'm going to take a stab at how I would do this and see where I need more answers.

Here are the buzzwords for what I'm trying to do: Git, PHP, Multisite, Deployment, Load Balancing, CDN, Capistrano.

Manage Your Code

  • Clone from a repository that everyone agrees is the working master. Keep it somewhere the whole team can easily reach, such as a third party service, whether free or private. Push back to this master, make atomic commits, share, branch, etc. Be sure it's agreed that the master branch needs to be production ready, but that it will be tested on staging before it's pushed live.
  • Pull into a staging server for QA. (* You may also need to merge special updates on production back. This should be avoided, but read on for potential situations where you may choose to do this.)
  • Use Git ignore to leave your user generated folders untracked, which prevents them from being moved to production or deleted. The .gitignore file lives in the root of your Git working directory, alongside the main .git directory. It merely contains a list of files and folders that should not be included in the repository.
  • You may also choose to ignore .htaccess or settings.php files for peace of mind from human error. *
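
As a rough illustration of the ignore rules described above, here is a minimal sketch that writes a starter .gitignore. The patterns assume a stock Drupal layout with uploads under sites/*/files; the function name and the scratch directory are only for demonstration.

```shell
#!/bin/sh
# Sketch: write a starter .gitignore for a Drupal project.
# Assumption: uploads live in sites/*/files and each environment
# keeps its own settings.php and root .htaccess out of the repository.
write_gitignore() {
  target_dir="${1:-.}"
  cat > "$target_dir/.gitignore" <<'EOF'
# User generated content: never deployed through Git
sites/*/files/
# Per-environment configuration, kept out of the repository
sites/*/settings.php
/.htaccess
EOF
}

# Usage: write into a scratch directory and show the result
demo_dir="$(mktemp -d)"
write_gitignore "$demo_dir"
cat "$demo_dir/.gitignore"
```

If you keep per-environment settings.php files in the repository for multisite, drop the settings.php line.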

Use Multiple Environments

  • Set up staging and development multisite database configurations (with settings.php) by creating several folders within the /sites directory other than default. Use a folder name that corresponds to how it will be accessed. (For example: "localhost/mysite" or "mysite.local" or "dev.mydomain.com")
  • You can also alter some of your settings for easy debugging. Make the following changes to your settings.php file within your local environment...

    $conf = array(
      'cache' => FALSE, // page cache
      'block_cache' => FALSE, // block cache
      'preprocess_css' => FALSE, // optimize css
      'preprocess_js' => FALSE, // optimize javascript
    );

    You will probably also want to turn on devel on the local machine. Right now I have to do this with a drush command on each refresh...

    drush en devel

    Need help with Drush? Take Morten's help http://morten.dk/blog/got-crush-drush
  • Use Git ignore when appropriate to prevent clutter or pushing these folders unless you all want to use the same dev namespace.
  • I haven't tested this yet, but there may be a way to use a different user uploaded files directory for local work. Then you don't have to ignore your production files directory because you won't be changing it on your local environment. However, the file paths written out by Drupal may then be wrong. Thoughts?
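
To make the multisite folder setup above concrete, here is a small sketch that creates one folder per environment. It assumes a Drupal 6/7 tree that ships sites/default/default.settings.php; the function names are made up for this example. Note that Drupal looks up a sub-path URL by replacing "/" with ".", so "localhost/mysite" is served from sites/localhost.mysite.

```shell
#!/bin/sh
# Sketch: create one multisite folder per environment.
# Assumption: the Drupal root contains sites/default/default.settings.php.

# Drupal maps a URL to a folder name by replacing "/" with ".",
# so "localhost/mysite" becomes "localhost.mysite".
site_dir_name() {
  printf '%s\n' "$1" | tr '/' '.'
}

create_env_site() {
  drupal_root="$1"
  url="$2"
  dir="$drupal_root/sites/$(site_dir_name "$url")"
  mkdir -p "$dir"
  # Start from the stock template without overwriting an existing file;
  # database credentials still need to be edited by hand.
  if [ ! -f "$dir/settings.php" ]; then
    cp "$drupal_root/sites/default/default.settings.php" "$dir/settings.php"
  fi
}
```

For example, `create_env_site /path/to/drupal mysite.local` (the path is a placeholder) leaves you with sites/mysite.local/settings.php ready for its own database settings.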

Handling Code Deployment

Config and Content Deployment

  • One of Drupal's Achilles heels (a trade off made for its strengths) is the lack of an official way to manage configuration between environments. In my opinion the main reason is that Drupal was designed to allow in-place administration, removing the need to write code to develop a site. Because of this, configuration is mixed with content, and on many sites a line can't be drawn between the two; examples include taxonomy, saved filters, block layout, global content and theming. Needless to say this is a hot topic right now (2011); if you want to know more check out Butler.
  • Arguably the primary solution to configuration AND content deployment (one way) is the Features module. For a better explanation than I can give, read this DevSeed post. Be advised this also requires Context, which is just a good tool anyway.
  • There are definitely other ways to accomplish config and content deployment for less complex setups, like: Node Export (moving things between sites), Node Import (bulk data importing tool), Node Import via Cron (schedule it), Menu Import (pull menus through), etc.
  • Additionally, there are less front and center deployment solutions, for example the Deploy module and the add-on Incremental Deploy.
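
For the Features route, the round trip between environments might look roughly like the sketch below. It only prints the drush commands so you can review them first. "site_config" is a hypothetical feature name, and I'm assuming the features-update and features-revert drush commands that come with the Features module are available.

```shell
#!/bin/sh
# Sketch: print the Features command for a given environment.
# Assumption: FEATURE is a hypothetical feature module name.
FEATURE="${FEATURE:-site_config}"

features_plan() {
  case "$1" in
    dev)
      # Write configuration changed in the UI back into the feature's code
      echo "drush features-update -y $FEATURE"
      ;;
    staging|production)
      # Make the database match the code that was just deployed
      echo "drush features-revert -y $FEATURE"
      ;;
    *)
      echo "unknown environment: $1" >&2
      return 1
      ;;
  esac
}

features_plan dev
features_plan staging
```

The updated feature code is committed and deployed like any other code between the two steps.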

Split Production Code Hosting

Split Farther With Static File Server
There appear to be a few options on this...

  1. Use settings.php to set your static URL and use this in your theme, http://xdeb.org/node/1221
  2. Manage this on the server level by mounting an external server for files, read more here http://www.troubleshooters.com/linux/nfs.htm or here http://tldp.org/LDP/nag2/x-087-2-nfs.mountd.html
    Some rewrite rules will also be necessary, either in .htaccess or in the Apache configuration if you've moved them there.
  3. Use one of the CDN modules. I haven't dug into these other than using the CloudFront module for imagecache files.

Split Production Database

  1. Separate off specific tables into another database. This is normally intended for sharing tables between sites for the purpose of single sign on, but is a valid way to keep things like your accesslog in a different database because it can grow so large. When appropriate here are some examples of good things to break off depending on your specific needs...
    • stats and access logs - accesslog, history, simplenews_statistics (etc.)
    • nodes - node, node_revisions
    • appropriate fields - content_field_xxx
    • cache tables - cache_content
    • access permission - node_access, users_roles (content access is quite the performance hit.)
    • flags - flag_content, flag_counts
    • core profiles - profile_values (though many folks use content profile.)
    • aliases and redirects - path_redirect, url_alias
    • search - search_index, search_dataset, search_node_links, search_total (though SOLR is the right thing.)
    • sessions - sessions
    • taxonomy - term_data, term_node, term_synonym
    • commerce - uc_orders, uc_payment_receipts, uc_product_options (if you're lucky!)
    • voting - votingapi_vote

    Here is the resource on using settings.php to split tables off... http://drupal.org/node/291373

  2. Drupal 7 allows for master/slave database configs, but I don't need that one yet.
  3. MySQL also allows for partitioning, though I kinda prefer the Drupal CMS approach. Here is a resource about that... http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
  4. Break some tables off into a more scalable database technology like: Cassandra (D6), MongoDB (D7), CouchDB (D6-dev)
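
For the table split in option 1, the tables have to be copied into the second database before settings.php can point at them. Here is a minimal sketch that prints the commands for that copy. The database names are hypothetical, both databases are assumed to live on the same MySQL server, and credentials are assumed to come from ~/.my.cnf.

```shell
#!/bin/sh
# Sketch: print commands that copy selected tables into a second database.
# Assumptions: MAIN_DB and STATS_DB are hypothetical names on one MySQL server.
MAIN_DB="${MAIN_DB:-drupal}"
STATS_DB="${STATS_DB:-drupal_stats}"

split_plan() {
  # "$*" is the list of tables to copy, for example: accesslog history
  echo "mysqladmin create $STATS_DB"
  echo "mysqldump $MAIN_DB $* | mysql $STATS_DB"
}

split_plan accesslog history
```

Only drop the original tables once the prefix mapping in settings.php is in place and verified.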
Additional Performance Improvements

    • Set far-future Expires headers and configure ETags with .htaccess (or in the Apache configuration if you've moved the rules there.)
    • Turn on page caching, CSS caching, JS caching, block caching. More details...
      • Turn these off on development with commands like: drush vset cache 0
      • Use Block Cache Alter or even AJAXify Regions to fine tune block caching.
    • Set up Varnish for static serving of pages and/or use Boost (which is easier but relies on .htaccess)
    • Set up Memcache to serve cached data from memory rather than through database calls.
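
The cache switches above can be flipped per environment with a few drush calls. This is a sketch only: it assumes the Drupal 6 variable names used in the settings.php snippet earlier, and by default it just prints the commands (set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Sketch: print (or run) the drush commands that toggle caching.
# Assumption: Drupal 6 variable names. DRY_RUN=1 (default) only prints.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

set_caching() {
  value="$1"   # 1 turns caching on (production), 0 turns it off (development)
  for name in cache block_cache preprocess_css preprocess_js; do
    run drush vset -y "$name" "$value"
  done
}

# Development: switch everything off
set_caching 0
```

Calling `set_caching 1` on production turns the same four settings back on.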

Running Updates

    • Of course, always test the drush updates on a development environment.
    • Your options...
      1. BEST: Commit code updates on development, push through normal route. Set site to full static cache mode (boost), maintenance mode, or read only mode and then run a database update (drush updatedb) on production.
      2. OK: Reduce your mirrored balanced servers to one. Run update on a production server, commit the code and pull code backward to staging and then farther back to working dev copy.
      3. MOST CAREFUL: Put site into special mode (see above), refresh staging database from production, run drush updates on staging, commit code, backup database, push BOTH code and database to production. Take out of special mode.
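
Option 1 could be scripted along these lines. Treat it as a sketch under assumptions: it uses Drupal 6's "site_offline" variable for maintenance mode (Drupal 7 calls it "maintenance_mode"), PROD_URI is a placeholder for your production site, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch of option 1: code is already deployed, only the database
# update runs on production while the site is in maintenance mode.
# Assumptions: Drupal 6 variable name; PROD_URI is a placeholder.
DRY_RUN="${DRY_RUN:-1}"
PROD_URI="${PROD_URI:-www.example.com}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

update_production() {
  # Each step runs only if the previous one succeeded, so a failed
  # update leaves the site in maintenance mode for inspection.
  run drush --uri="$PROD_URI" vset -y site_offline 1 &&
  run drush --uri="$PROD_URI" updatedb -y &&
  run drush --uri="$PROD_URI" cc all &&
  run drush --uri="$PROD_URI" vset -y site_offline 0
}

update_production
```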

Refreshing Environments

    • I currently use a sad list of nine commands to refresh dev and staging. These can easily be turned into a shell script, but I haven't done it yet. Yes, this is embarrassing. Here is what they look like (assumptions: using MAMP with a symlink to ~/Sites or local Apache, keeping Drupal in a sub-folder of the project directory, and your hosts file has a local URL of SITE.local)

      # Fetch the latest database backup and unpack it
      scp -v username@SERVER:/backups/database/db_drupal_MM_DD_YYYY.sql.gz /Users/ME/Sites/SITE/db.sql.gz
      gunzip /Users/ME/Sites/SITE/db.sql.gz
      # Load the dump (assumes the dump selects its own database)
      mysql -h127.0.0.1 -uroot -proot < /Users/ME/Sites/SITE/db.sql
      # Pull only images and Flash files from the production files directory
      rsync -avz --include "*/" --include="*.jpg" --include="*.swf" --include="*.png" --include="*.gif" --exclude "*" username@SERVER:/backups/code/drupal/sites/default/files/ ~/Sites/SITE/drupal/sites/default/files/
      # Switch on the development modules and permissions
      cd /Users/ME/Sites/SITE/drupal
      drush en devel --uri=SITE.local
      drush cc all --uri=SITE.local
      drush en permissions_api --uri=SITE.local
      drush perm-grant 'anonymous user' 'access devel information' --uri=SITE.local

      These commands need to be run every time you want to refresh dev and start working.
    • You can also use the Environment module to automate enabling/disabling modules. For example disabling LDAP and enabling Devel.
    • Some folks have used Capistrano for this too. http://www.blog.bridgeutopiaweb.com/post/capistrano-task-for-loading-pro...
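
Here is one way the refresh commands above might be wrapped into a script. It is a sketch, not something I've run against a real site: SERVER, REMOTE_USER, SITE and DUMP_DATE are placeholders just like in the original commands, `gunzip -f` and the `-y` flags are added so it can run unattended, and the function only prints the plan so you can read it before running anything.

```shell
#!/bin/sh
# Sketch: the refresh commands as one parameterised plan.
# Assumptions: all values below are placeholders; the dump file name
# follows the db_drupal_MM_DD_YYYY.sql.gz pattern.
SERVER="${SERVER:-SERVER}"
REMOTE_USER="${REMOTE_USER:-username}"
SITE="${SITE:-SITE}"
DUMP_DATE="${DUMP_DATE:-$(date +%m_%d_%Y)}"
SITE_DIR="${SITE_DIR:-$HOME/Sites/$SITE}"

refresh_plan() {
  # Variables are expanded here; quotes and globs are printed literally.
  cat <<EOF
set -e
scp -v $REMOTE_USER@$SERVER:/backups/database/db_drupal_$DUMP_DATE.sql.gz "$SITE_DIR/db.sql.gz"
gunzip -f "$SITE_DIR/db.sql.gz"
mysql -h127.0.0.1 -uroot -proot < "$SITE_DIR/db.sql"
rsync -avz --include="*/" --include="*.jpg" --include="*.swf" --include="*.png" --include="*.gif" --exclude="*" $REMOTE_USER@$SERVER:/backups/code/drupal/sites/default/files/ "$SITE_DIR/drupal/sites/default/files/"
cd "$SITE_DIR/drupal"
drush en devel -y --uri=$SITE.local
drush cc all --uri=$SITE.local
drush en permissions_api -y --uri=$SITE.local
drush perm-grant 'anonymous user' 'access devel information' --uri=$SITE.local
EOF
}

# Print the plan for review
refresh_plan
```

Once the output looks right, `refresh_plan | sh` runs it, stopping at the first failing command.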
