Peter on Software
Lessons Learned in software and web development
Technical Home Assignments
Lately I've been staying busy with technical home assignments for Software Developer jobs that I've applied for. It's mostly simple REST APIs in Python or Node.js but I've learned a thing or two along the way. For one thing I've come to appreciate the FastAPI Python framework and the type hint based validation and OpenAPI integration that it offers.
- Music Recommendation API - Python/FastAPI/OpenAI/pgvector
- Metrics Dashboard - Node.js/Fastify with websockets and React
- Movie REST API - Python/FastAPI/Postgres deployed to GCP
- Visitor Location Map - React/Google Maps with Python/FastAPI deployed to Heroku and Vercel
Realtime Replication of CMS Data from MongoDB to S3
The Maintenance Burden of Microservices
Autonomous teams are a popular idea in the industry today and one that tends to be associated with microservices. Autonomous teams fully own their services and are trusted to make their own decisions. The idea is that this will make them more motivated and productive. I fully support this idea. There are many other factors that drive microservices adoption such as scalability, performance, fault tolerance and the architecture of cloud infrastructure (i.e. the emergence of serverless functions etc.). However, multiple challenges arise when a team goes from owning a handful of services to owning dozens. Unfortunately I believe that when a team splits up its services on a micro scale it is setting itself up for a heavy maintenance burden.
When creating a new microservice we as developers tend to feel really good about ourselves and we can be quite productive. We enjoy how seemingly isolated and modular microservices are and we enjoy doing greenfield development. If we are lucky we may even get to use our favorite language, framework, coding conventions, or infrastructure. We feel free as we are seemingly no longer bound by legacy systems. Let's imagine a different scenario down the road though where we find ourselves having to do maintenance of a large number of microservices that have accumulated over several years and generations of developers with different preferences - many of whom have left. This scenario is obviously not quite as attractive...
Here are some challenges with microservices:
- Duplication and boilerplate. All the code that does not constitute the essential business logic of your services will tend to get duplicated each time you create a new microservice. One can take steps to reduce this boilerplate of course but in practice there tends to be a lot of it and this problem tends to grow over time.
- Weak integrity. When working within a single app we are assisted by the language, the IDE, the compiler, and the tests to ensure the integrity of function calls across modules. When sending messages between microservices over the network (i.e. with REST API calls or message queues) maintaining that integrity is much more difficult. Also, any changes to a REST API or a message needs to be backwards compatible. This puts a significant constraint on our ability to change and refactor the system over time.
- Difficult debugging. You typically don't have a stack trace when debugging errors in a microservices environment and this obviously makes debugging more challenging.
- No system-level regression testing. Microservices lead to lower end-to-end system level test coverage and you can therefore not be as confident that the system as a whole continues to work when you make changes. Keeping the functionality of the entire system intact is obviously what matters to end users and thus to your business. Developers will typically test microservices in isolation (and possibly in a shared staging environment) and then ship them to production and rely on monitoring to ensure that they work well. The ability to spin up an entire environment every time you make a change is typically not there. Unfortunately, the independence of microservices is often an illusion as microservices tend to depend on each other in many unspecified and undocumented ways. For example, to scale one service you may need to scale its dependencies, and one microservice failing can cause other services to fail (unless good circuit breaking is in place) etc. Also it's quite common to see microservices being integrated via a shared database or file storage even though this is supposed to be a strict no-no.
- Learning curve. As a developer you may struggle to see the full picture of the system as it's not expressed in code. If you are lucky you may have access to a few architecture diagrams but they can typically not be relied on to be comprehensive and up-to-date. The fact that services get built with different languages, frameworks, different versions of libraries, different directory structures etc can be a blessing but also a curse. It leads to duplication and a much greater learning curve and cognitive load for developers.
Let's review the action points and decisions involved in creating a new service:
- Choice of programming language, frameworks and libraries and their versions and how to keep those up-to-date
- Coding conventions, folder structure and naming conventions
- How to do testing, linting, and building
- How to do configuration (where to store secrets, env variables, config files etc.)
- How to do logging
- How to setup deployment and infrastructure (including choice of cloud provider etc.)
- Setting up a build pipeline (including potential choice of CI provider, i.e. CircleCI or Github Actions etc.)
- How to do monitoring
- How to do documentation (README files etc.)
Now suppose you want to make a change in any of the areas listed above. Let's say you want to change cloud provider, or build pipeline, or the framework that you use, or upgrade the version of your programming language. Instead of being able to make this change with one or a handful of PRs you end up needing to make dozens of PRs. Or, in the most likely scenario, you don't have time to create all those PRs and your services simply grow more inconsistent over time.
Let's think about everything that is not core business logic in a service. In other words let's think about the incidental complexity, the implementation details, and the boilerplate. To make things concrete I will use a Node.js API hosted on Github and deployed with Docker on AWS as my example. Here is an incomplete list of boilerplate for such an API:
- package.json - scripts for testing and running the server and all libraries and frameworks and their versions
- .npmrc - configuration for the package manager
- .nvmrc - Node version
- .env - env variables
- config - configuration files
- Dockerfile - Docker configuration
- .gitignore - git config
- .dockerignore - Docker configuration
- .circleci - build pipeline
- .eslintrc - linting
- jest.config.js - test config
- cdk - infrastructure/deployment code or config (can be thousands of lines)
- server.js, routes.js - code to start a web server and do routing etc.
- db.js - code to talk to the database
- swagger.json - API documentation
What is the ratio of essential business logic code to boilerplate in a microservice? Well that varies but it's not uncommon for there to be at least as much boilerplate as business logic.
As a small development team I think the goal should be to only be maintaining a handful of services. Sometimes architecture will force us into having more services or the team may have too much on its plate but we should try to avoid that if we can.
In this post I've mostly talked about the duplication problem but another important aspect is that microservices come with distributed computing and a fundamental increase in complexity. Ironically this may actually be one reason why developers are attracted to this architecture. After all, developers tend to be drawn to complexity and the challenge of solving hard problems.
Source Code needs to Carry its own Weight
How often have you as a software developer come across a system that is really well designed and well tested and that isn't also over-engineered in some way?
How often is there a cost/benefit consideration when we introduce a new abstraction or level of indirection in our software? At the time of writing the abstractions it feels like all benefits and it all makes perfect sense. There is no cost other than the time it takes to write the code, and as developers we can write code really fast. New developers who discover the abstractions down the road may have very different feelings about them though.
Let's say you are building a REST API on AWS Lambda with a database like PostgreSQL or MongoDB. How many layers of abstraction do you need for a system like that? Should you abstract away the cloud provider, the database, SQL, and HTTP? Should you build those abstractions regardless of the likelihood of change? Sometimes we seem to be suffering from the "not invented here" syndrome and we seek to shield and cushion our delicate code from the capricious outside world. One maybe overlooked advantage of directly using a standard interface (like HTTP, or SQL, or even AWS Lambda) is that it tends to be well specified, documented, battle tested, and understood. That is typically not the case with custom/proprietary/internal interfaces and abstractions (interfaces that we create to wrap standard ones). Suppose you abstract away your cloud provider or database and that the likelihood of that dependency changing is 1%. You will then be paying the cost of the abstraction in increased complexity and code size in 100% of future scenarios. Only in the 1% scenario will you possibly reap the benefit of the abstraction making the technology change easier. But even in that case you will probably end up needing to do some work to accommodate the change and you haven't necessarily benefited from the investment.
Let's remind ourselves of the "You aren't gonna need it" (YAGNI) and "do the simplest thing that could possibly work" principles. Maybe instead of investing in speculative generalizations up-front we should build the abstractions and adapters once we actually need them? Maybe in the future once we have the concrete needs and use cases we will better know what to build.
Learning From Your Bugs
- Precise names can help prevent bugs. Once I changed the argument name from "name" to "default_name" the bug became more obvious.
- Configurability/flexibility of software drives complexity and may result in unused/untested branches and those are often a source of bugs. In my case the ability to override the model name was simply a feature that I wasn't using much so it was not well tested and thus the bug didn't show up immediately.
- Bugs usually happen in the periphery of your application, i.e. in the less tested and used features or in the edge cases
- Silent failures/missing assertions
- Too loud failures/missing error boundary
- Missing null checks (being defensive and checking for missing data at runtime). Your static types and compiler may indicate that a property cannot be null but that doesn’t guarantee that it’s not null at runtime - an external API may for example feed you data that is incomplete or invalid.
- Missing timeouts, too short or too long timeouts
- Missing retries (network issues etc)
- Not good enough validation
- Not good enough testing
- Too tight or too loose regular expressions
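The missing-null-check point above can be made concrete: even when your static types say a field exists, data from an external API may not have it, so read it defensively at runtime. Here is a small Ruby sketch with a hypothetical payload:

```ruby
require "json"

# A payload from an external API that is missing the expected "email" field.
payload = JSON.parse('{"user": {"name": "Ada"}}')

# Unsafe: payload["user"]["email"].downcase would raise NoMethodError on nil.
# Defensive: dig returns nil on missing keys, and we fall back explicitly.
email = payload.dig("user", "email") || "unknown@example.invalid"
name  = payload.dig("user", "name")  || "unknown"
[name, email] # => ["Ada", "unknown@example.invalid"]
```

The explicit fallback also documents what the code should do when the data is incomplete, instead of leaving that case to a crash far away from the cause.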
Re-implementing this Blog again - Node.js Without Dependencies + Clojure CMS
Back in the day I originally built this blog on Ruby on Rails. I remember upgrading the app several times between different major versions of that framework. A couple of years ago I re-implemented the blog with Clojure and the Luminus web framework. I wasn't particularly impressed with Luminus but I certainly enjoyed Clojure. While working at C More last year I built an open source CMS. I also implemented a couple of APIs on Node.js where I attempted to minimize library dependencies. I found this to be a really good learning experience. As for this blog it ended up being around 1000 lines of Node.js code with zero dependencies other than Node.js itself. Obviously the majority of that code is functionality that would normally be provided by frameworks and libraries such as routing, http, templating etc. The templating code that I use is based on John Resig's Micro Templating script from 2008 with the addition of default HTML escaping and the ability to nest templates (i.e. includes).
The backend part of this blog is handled by a MongoDB based REST API with documentation provided by Swagger. That Swagger specification is used to dynamically generate an admin UI built on Vue.js. A generic admin UI is possible because the Swagger data contains a JSON schema for the blog posts model with additional metadata like which fields are relevant in the admin UI and which form fields to use etc. As a backup solution I have an after-save callback in my blog post model which stores the JSON data on Dropbox.
Using Clojure Multimethods for Polymorphism and Inheritance
It's fascinating how simple and powerful the multimethod feature in Clojure is. It provides a way to support the polymorphism and inheritance that we are used to from object-oriented languages:
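The original Clojure snippet doesn't survive on this page, but the core idea of a multimethod (a dispatch function computes a value from the arguments, and that value selects the implementation) can be sketched in Ruby. All names below are illustrative, not Clojure's API:

```ruby
# A minimal multimethod: dispatch on a value computed from the arguments,
# then look up the matching implementation in a table.
class Multimethod
  def initialize(&dispatch_fn)
    @dispatch_fn = dispatch_fn
    @methods = {}
  end

  def defmethod(dispatch_value, &impl)
    @methods[dispatch_value] = impl
  end

  def call(*args)
    @methods.fetch(@dispatch_fn.call(*args)).call(*args)
  end
end

# Dispatch on the :type key, much like (defmulti speak :type) in Clojure.
speak = Multimethod.new { |animal| animal[:type] }
speak.defmethod(:dog) { |animal| "#{animal[:name]} says woof" }
speak.defmethod(:cat) { |animal| "#{animal[:name]} says meow" }

speak.call({ type: :dog, name: "Rex" }) # => "Rex says woof"
```

The appeal is that dispatch can be on any function of the arguments, not just the class of the receiver as in most object-oriented languages.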
Clojure on Heroku Gotcha - PostgreSQL, Luminus, and JDBC
As I rewrote this blog in Clojure using Luminus one of the main stumbling blocks was getting the PostgreSQL connection working on Heroku. It turned out that the Java DriverManager that the clojure.java.jdbc library talks to can't handle the Heroku DATABASE_URL directly as it contains username and password. However, if you pass the db specification as a string to clojure.java.jdbc/get-connection then the Clojure function will parse the database URL for you and pass on the proper credentials to Java. This turns out to be well described in the Heroku documentation (see below); it's just that the :connection-uri approach that Luminus uses by default (as of this writing) doesn't work on Heroku.
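The parsing that the library does on your behalf amounts to pulling the credentials out of the URL. Here is the same idea sketched in Ruby with made-up example values:

```ruby
require "uri"

# A Heroku-style DATABASE_URL with credentials embedded (example values).
url = URI.parse("postgres://user:secret@ec2-1-2-3-4.compute.amazonaws.com:5432/mydb")

# Split the URL into the separate fields that JDBC's DriverManager expects.
db_spec = {
  host:     url.host,
  port:     url.port,
  dbname:   url.path.delete_prefix("/"),
  user:     url.user,
  password: url.password
}
# => host, port 5432, dbname "mydb", user "user", password "secret"
```

The gotcha is exactly that the userinfo part (user:secret@) is URL syntax, not something the Java driver understands inline.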
Clojure - a Ring www Redirect Middleware
Something I really appreciate with web development in Clojure is the Ring HTTP API and especially how requests and responses are associative data. This is of course a reflection of the data-oriented nature of the Clojure language. It's also elegant how you can use the Clojure thread-first macro to chain middleware together. A middleware takes a Ring handler as its first argument plus any additional arguments that configure the handler and returns a new handler function. A Ring handler is a function that takes a request map and returns a response map. I guess you can't make it much simpler than that.
Here is an example of a simple www-redirect middleware:
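The original Clojure snippet doesn't render here, but the shape it describes (handler = request map in, response map out; middleware = handler in, handler out) can be sketched with Ruby lambdas. The field names are illustrative, not the actual Ring keys:

```ruby
# A handler is a function from a request hash to a response hash.
handler = lambda do |request|
  { status: 200, headers: { "Content-Type" => "text/html" }, body: "Hello" }
end

# A middleware takes a handler (plus configuration) and returns a new handler.
# This one redirects bare-domain requests to the www subdomain.
def wrap_www_redirect(handler, domain)
  lambda do |request|
    if request[:server_name] == domain
      { status: 301,
        headers: { "Location" => "http://www.#{domain}#{request[:uri]}" },
        body: "" }
    else
      handler.call(request)
    end
  end
end

app = wrap_www_redirect(handler, "example.com")
app.call({ server_name: "example.com", uri: "/about" })
# => 301 redirect to http://www.example.com/about
```

Because middleware are just functions from handler to handler, wrapping several of them is ordinary function composition, which is what the thread-first macro expresses so neatly in Clojure.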
Clojure on Heroku Gotcha - Missing SSL certificates and jdk-overlay to the rescue
In order to get my Clojure app on Heroku to talk to the Geckoboard HTTPS REST API (at https://push.geckoboard.com/v1/send) I needed to add the file .jdk-overlay/jre/lib/security/cacerts that I copied from my local Java installation (it's under $JAVA_HOME). Apparently it's due to licensing issues that Heroku can't include the necessary certificates in their Java installation. For more, see Customizing the JDK at the Heroku dev center.
Here is an example REPL session that reproduces the problem:
(require '[clj-http.client :as client])
(client/post "https://push.geckoboard.com/v1/send" {:body "some body here"})
SunCertPathBuilderException unable to find valid certification path to requested target
sun.security.provider.certpath.SunCertPathBuilder.build (SunCertPathBuilder.java:145)
Launching new Website
We+
This autumn I have joined We+ - a new exciting startup in the exercise and health area. Our product uses the power of small groups and positive peer pressure to promote exercise among employees. We have already built a first version of the system and done Alpha testing and this week we are kicking off a pilot with two large Swedish companies. Exciting!
Creating and Deploying an EdgeRails (Rails 4) Application to Heroku
I wanted to create a new Rails app based on EdgeRails (Rails 4) and I didn't find much when I googled around so I ended up creating a Gist on Github and answering the corresponding question on Stack Overflow.
I'm really happy to see how Rails continues to evolve through thousands of small improvements, many of which are features which I know would have been useful in projects I have worked on in the past. I hope to be able to cover some of the new stuff in Rails 4 later.
Explore the Future of Tablet Publishing With Mag+
Course Material for a Two Day Introductory Ruby Course
I'm giving a series of two day introductory Ruby courses to C++ programmers at Ericsson here in Stockholm and I've made the course material available on Github. The course material includes slides (keynote and pdf) as well as Ruby code examples. For the exercise part I am relying on the Ruby Koans and they have been much appreciated.
I enjoy being back in the teacher role and I hope to be able to teach courses like this many times again in the future.
Rails Tip: Inserting NULL to the Database Instead of Empty Strings
The value NULL in a relational database represents the absence of a value. Empty text fields and text areas in HTML forms on the other hand get submitted in Rails as empty strings. This means you can easily end up with empty strings in the database where you would expect NULL values. I came up with the following workaround for our ActiveRecord models:
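The original snippet is missing from this page, so here is only a sketch of the idea: convert blank strings to nil before saving, so the database column ends up NULL. In the real app this would live in a before_save callback on an ActiveRecord model; the plain-Ruby stand-in below keeps the example self-contained:

```ruby
# Convert empty strings to nil so the database gets NULL instead of "".
module NilifyBlanks
  def nilify_blank_attributes(attributes)
    attributes.transform_values do |value|
      value.is_a?(String) && value.strip.empty? ? nil : value
    end
  end
end

class BlogPost
  include NilifyBlanks
  attr_reader :attributes

  def initialize(attributes)
    # In ActiveRecord this conversion would run in a before_save callback.
    @attributes = nilify_blank_attributes(attributes)
  end
end

post = BlogPost.new("title" => "Hello", "summary" => "")
post.attributes # => {"title" => "Hello", "summary" => nil}
```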
Rails 3.0.3 Backwards Incompatible for File Uploads
We upgraded from Rails 3.0.1 to Rails 3.0.3 yesterday and sadly this broke file uploads for us in production. What do we conclude from this other than that we are missing integration tests for file uploads? Well, one noteworthy thing is that the Rails 3.0.3 release is not quite as backwards compatible as announced, at least not when it comes to file uploads. In Rails, file fields in multipart forms are exposed in the params hash as an ActionDispatch::Http::UploadedFile object. As of a commit by tenderlove on the 5th of October this class no longer inherits from Tempfile but instead delegates to Tempfile (i.e. tenderlove favored composition over inheritance like the GoF prescribes). Unfortunately, the UploadedFile class only delegates a handful of methods but not two important methods that we happened to be using, namely open and path. So now, for example when getting the path of the tempfile we have to fetch the tempfile first:
params[:my_file].tempfile.path
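The underlying change (delegation instead of inheritance) can be illustrated with Ruby's Forwardable: only the explicitly listed methods exist on the wrapper. The class below is an illustrative stand-in, not Rails' actual UploadedFile:

```ruby
require "forwardable"
require "tempfile"

# A wrapper that delegates to a Tempfile instead of inheriting from it.
# Only the listed methods are forwarded; anything else (like the `path`
# method we relied on) is missing unless you go through `tempfile`.
class UploadedFileLike
  extend Forwardable
  attr_reader :tempfile
  def_delegators :@tempfile, :read, :size

  def initialize(tempfile)
    @tempfile = tempfile
  end
end

file = UploadedFileLike.new(Tempfile.new("upload"))
file.respond_to?(:path) # => false, path is not delegated
file.tempfile.path      # works, same shape as params[:my_file].tempfile.path
```

With inheritance every Tempfile method came along for free; with delegation the wrapper's API is exactly the forwarded list, which is why a subset that looks "good enough" can silently break callers.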
Testing Rails Migrations
Is it worthwhile testing Rails ActiveRecord migrations? After all, they are only intended to be run once so regression testing isn't an issue. I honestly haven't tested my migrations much in the past but I recently decided to give it a try. I was surprised by the fact that it wasn't very different from testing any other part of my application. My test didn't end up having very good coverage so I still needed to test the migration manually. As usual when writing tests, I found that it drove a series of extract method refactorings. I went from having all code in the up method to having five shorter methods. A different approach to migration testing is to add sanity checks at the end of migrations that output an error message in production if the outcome of the migration wasn't what was expected.
Heroku Deploys With Rollbacks and Changelog
One of the features I miss from Capistrano is the ability to easily do rollbacks when deploying to Heroku. What you can easily do though is git tag your releases and then do a rollback by pushing the previous release tag. I've created a RubyGem called heroku_release that does this. The gem has a few additional features such as the ability to generate a CHANGELOG file from the release tags and their comments. I also use it to generate a version file so that I can check on the live server what version of the code it is running.
It's interesting to note that Heroku is apparently working on supporting release management and logging - two problems I have ended up rolling my own solutions for recently. I look forward to seeing what they have come up with.
Ruby Debug Printouts
It can get really tiresome and repetitive to do debug printouts with puts. I've created a simple RubyGem called debug_log that gives you a convenient way of evaluating and printing variables and other Ruby expressions that you want to debug. Here is an example:
It's funny how I created this gem pretty much at the same time as Niclas Nilsson created his dp gem. I owe the approach to patching the binding object to Niclas. I think that is a beautiful solution as it avoids you having to pass the binding object as an argument.
Ruby Testing: Avoid Stubbing Non-Existent Methods with Mocha
Mocks and stubs can be fragile and come back and bite you when they get out of sync with your code. One way this can happen is that a method is renamed or you misspell the method name. To avoid this issue you can configure Mocha to disallow stubbing and mocking of non-existent methods. I've come up with two new methods for the cases where you are working with messages implemented with method_missing:
It is interesting to note that this extension of Mocha is possible because of its flexible and clever design. Mocha was recently updated to version 0.9.9. I am grateful to James Mead (Floehopper) for providing this excellent testing library.
Request Log - RubyGem for Logging Rack (Rails) Web Requests to MongoDB
Prompted by the fact that Heroku doesn't keep the Rails request logs around I went out looking for a logging solution. What I've ended up with is Request Log - a simple RubyGem for logging web requests to MongoDB.
My experiences with logging to MongoDB so far have been very positive. I see big potential in logging web requests to a database. The reason MongoDB is so well suited for the task is its high performance and strong query capabilities. This allows you to do advanced queries such as "give me all requests in this time period, with this response status, this execution time, these parameters etc.". Each web request becomes a document in MongoDB and if you choose your database fields wisely you have a great tool at your disposal for statistics, monitoring, and debugging etc.
I'm curious to see how we'll be able to design and use our web request logs in the project I'm currently in. I'll report back here any interesting findings that we make.
Rails Presentation: Minimizing Library Dependencies
I gave a presentation tonight entitled "Minimizing Library Dependencies" at the Stockholm Ruby User Group (SHRUG) meeting. The event was hosted by MediaPilot and sponsored (with beer) by Auktionskompaniet and it turned out to be a huge success with 73 registered attendees, great presentations and atmosphere. I talked to David Wennergren about hosting the next meetup and our ambition is to have about one per quarter. It's great to see the community coming to life again!
The slides for my presentation are hosted on Github and are also available here.
Migrating RSpec to Mocha
Over at the MyNewsdesk developer blog:
PostgreSQL Unreliable Default Sort Order and Random Rails Test Failures
More good stuff from the MyNewsdesk Developer Blog:
Rails 2.3.2 Bug and Rails Cache Enable/Disable
Two new posts from the MyNewsdesk Developer Blog:
New Plugins For Rails Model Caching: Cachable Model
As part of an ongoing effort to offload our database at MyNewsdesk I have released a new Rails plugin called Cachable Model. The plugin is similar to an older plugin called Cached Model in that it basically caches primary key id lookups for ActiveRecord models. The Cachable Model plugin has an extra feature that lets you cache lookups by other unique columns as well. Here is an example:
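The example didn't survive on this page, and rather than guess the plugin's actual API, here is just a sketch of the underlying idea: cache lookups keyed by column and value. A real implementation would sit on a shared cache like memcached and invalidate entries on save; this in-memory version only illustrates the lookup path:

```ruby
# Cache record lookups by id or by any unique column (e.g. :permalink).
class ModelCache
  def initialize(finder)
    @finder = finder # e.g. ->(column, value) { Model.find_by(column => value) }
    @cache = {}
  end

  def fetch(column, value)
    @cache[[column, value]] ||= @finder.call(column, value)
  end

  def invalidate(column, value)
    @cache.delete([column, value])
  end
end

db_hits = 0
records = { 1 => "Alice", 2 => "Bob" }
cache = ModelCache.new(->(_column, id) { db_hits += 1; records[id] })

cache.fetch(:id, 1) # => "Alice" (hits the "database")
cache.fetch(:id, 1) # => "Alice" (served from the cache)
db_hits             # => 1
```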
Newsdesk Developer Blog - Back in Business
Now that I'm back working at Newsdesk I've started posting in our developer blog again:
Book Tip: The Mythical Man Month
Fred Brooks is a Computer Science professor who managed 5000 man-year IT projects at IBM in the sixties. His words carry a lot of weight and he certainly has something profound to say about software engineering. The central theme of the book is that of conceptual integrity. It's about the need to have a single architect, a master mind, who oversees all development and makes sure all the parts fit together. "A clean, elegant programming product must present to each of its users a coherent mental model of the application". The ideal scenario for conceptual integrity naturally is having a single programmer. The problem is that some systems are so big that in order to finish them before they are obsolete you need a large number of developers. Much of the book is spent discussing this difficult problem. How do you organize huge developer teams?
Something I found myself wondering as I was reading the book is how large scale open source projects are able to organize themselves and how they differ from commercial projects. The Wikipedia article on Brooks's law (i.e. adding people to a late project makes it later) suggests that open source projects scale through "Efficient parallelization of work, reducing the communication overhead" and through having a large number of testers.
One of the ideas in the book that resonated the most with me is that of the incremental build model. You start out by building an end-to-end skeleton system, i.e. system with a small subset of the complete functionality. This sounds very much like the Tracer Bullet approach advocated by the Pragmatic Programmers. The benefit is early user feedback and exploration of the users needs.
According to Brooks, "...the quality of the people on a project, and their organization and management, are much more important factors in success than are the tools they use or the technical approaches they take.". Brooks refers to the Peopleware book when making this point, i.e. it's about the physical and social work environment, about aligning and motivating developers etc. However, it also ties in with the idea of conceptual integrity, i.e. you need to solve the communications problem and make sure that the left hand knows what the right hand is doing.
Brooks offers a great analysis on the problems of software maintenance and fixing bugs:
"The fundamental problem with program maintenance is that fixing a defect has a substantial (20-50 percent) chance of introducing another. So the whole process is two steps forward and one step back.
Why aren't defects fixed more cleanly? First, even a subtle defect shows itself as a local failure of some kind. In fact it often has system-wide ramifications, usually nonobvious. Any attempt to fix it with minimum effort will repair the local and obvious, but unless the structure is pure or the documentation very fine, the far-reaching effects of the repair will be overlooked. Second the repairer is usually not the man who wrote the code, and often he is a junior programmer or trainee."
Brooks ends his essay with these dark words:
"System program building is an entropy-decreasing process, hence inherently metastable. Program maintenance is an entropy-increasing process, and even its most skillful execution only delays the subsidence of the system into unfixable obsolescence."
The most famous essay in the book is No Silver Bullet — Essence and Accidents of Software Engineering. According to Brooks, programming at its essence will always be a complex, time consuming, and error prone thought process, and therefore, despite advances in technology, we will never see the kind of explosive productivity growth in software that we have seen in hardware.
This is just a small teaser of what the book has to offer. If you are at all interested in the methodology and management of software development I highly recommend checking it out.
Teaching a Three Day Ruby on Rails Course in Rome
This weekend I taught a three day Ruby on Rails course here in Rome in Italy. It was a great experience and the people down here have shown the greatest hospitality and have taken very good care of me.
The format of the course was like a workshop with a small group of participants in a private and relaxed setting. I used my course material as a starting point and a road map but then improvised a lot and ended up doing a lot of hands on programming. Basically the course was divided into three parts:
- Ruby and Rails fundamentals. A lot of time was spent on the programming language itself and we toured the features of Ruby by live coding in TextMate.
- Writing a demo application. Similar to the AWR book, I built the basics for a store application, including file upload, advanced ActiveRecord associations, and deployment to a VPS with Capistrano.
- The participants got to work on different features in the application. Time was also spent code reviewing the participants own Rails applications.
Overall I must say the course was quite a success. Every time I teach a course I look for ways to make my courses more interactive, more hands on, and more tailored to the needs of the participants.
I also had a great time outside the course in Rome - a city that I fell in love with immediately.
Rails Counter Cache Updates Leading to MySQL Deadlock
I've gotten a few error messages lately where a plain vanilla ActiveRecord counter cache update (update_counters_without_lock method) has led to an error being thrown from Mysql - "Mysql::Error: Deadlock found when trying to get lock; try restarting transaction: UPDATE `events` SET `attendees_count` = COALESCE(`attendees_count`, 0) + 1 WHERE (`id` = 1067)".
It seems someone else has tried to report this as a bug but Mysql is saying that it's a feature and is referring to the lock modes documentation. There is some interesting info on deadlocks in InnoDB over at rubyisms. I haven't had time to dig into the theory though. Has anybody else had this issue? What can be done about it (other than switch to PostgreSQL)?
Rails Testing: To Stub or Not to Stub
Over the last couple of years testing has been a controversial topic in Rails. In the beginning though Rails was of course opinionated and gave us model and controller tests along with fixtures for testing our apps. Then integration tests were added by Jamis. Some started using browser testing with tools such as Selenium and Watir. Then the whole RSpec movement swept in with a new terminology, a heavier use of mocking and stubbing, and more isolated testing of the different layers of the MVC stack. One of the most important trends right now seems to be to do "Outside In TDD" with Cucumber and webrat. There are alternatives to RSpec like Shoulda and a move towards simpler tools such as Jeremy McAnally's Context and Matchy libraries. There are a number of Factory libraries for replacing fixtures. Maybe the most important controversy over the years has been on whether the database should be stubbed out or not. One of the most common arguments for stubbing out the database is to keep the test execution time low.
Personally I've always been a stubbing sceptic. Given all the changes in the testing landscape it's interesting to read that at Thoughtworks there is a movement away from stubbing and back to where it all started:
"As the teams became familiar with using method stubbing, they used it more and more - falling into the inevitable over-usage where unit tests would stub out every method other than the one being tested. The problem here, as often with using doubles, is brittle tests. As you change the behavior of the application, you also have to change lots of doubles that are mimicking the old behavior. This over-usage has led both teams to move away from stubbed unit tests and to use more rails-style functional tests with direct database access."
Rails Development Database Setup Without Migrations
I posted over at the Newsdesk developer blog about Rails Development Database Setup Without Migrations.
The Newsdesk Developer Blog
I have been doing a bit of blogging over at the Newsdesk developer blog. My two most recent posts are on shelling out to external programs and eager loading of application classes to save STI.
Rails Gotcha: ActiveRecord::Base.valid?, errors.empty?, and before_validation Callbacks
The ActiveRecord::Callbacks module (that depends on ActiveSupport::Callbacks) defines a multitude of before and after callbacks for the lifecycle of your ActiveRecord objects. Similar to how before filters in controllers work, if a before_validation callback method returns false, the save process will be aborted. Here is what the API documentation says:
# If the returning value of a +before_validation+ callback can be evaluated to +false+,
# the process will be aborted and Base#save will return +false+.
# If Base#save! is called it will raise a ActiveRecord::RecordInvalid exception.
# Nothing will be appended to the errors object.
# If a before_* callback returns +false+, all the later callbacks and the associated
# action are cancelled. If an after_* callback returns +false+,
# all the later callbacks are cancelled. Callbacks are generally run in the
# order they are defined, with the exception of callbacks defined as methods
# on the model, which are called last.
What this means is that if a before_validation callback returns false, then save and valid? will both return false but errors.empty? will return true. Yes, you read correctly, there are no validation errors but the record is not valid. Also, note that the valid? method defined in the ActiveRecord::Base class will never even be invoked.
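This counterintuitive combination can be illustrated without Rails at all. Below is a minimal plain-Ruby sketch that only mimics the abort semantics described above; the Record class and its single validation are invented for illustration:

```ruby
# A plain-Ruby simulation of how ActiveRecord aborts the validation process
# when a before_validation callback returns false: valid? returns false,
# but the errors collection stays empty because no validations ever ran.
class Record
  attr_reader :errors

  def initialize(&before_validation)
    @before_validation = before_validation
    @errors = []
  end

  def valid?
    # Abort before any validations run, just like ActiveRecord does
    return false if @before_validation.call == false
    run_validations
    @errors.empty?
  end

  private

  def run_validations
    @errors << "name can't be blank"
  end
end

record = Record.new { false }
record.valid?        # => false
record.errors.empty? # => true (no validation errors, yet the record is not valid)
```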
Rails Tip: Running Tests with Verbose Output
If you want to run your Rails tests with verbose output you can use the TESTOPTS argument like this:
rake test TESTOPTS="-v"
Now instead of just getting dots as progress indicators you get a list of test methods (blocks) and test cases like this:
test_create_with_errors(NewsletterTest): .
test_destroy(NewsletterTest): .
test_send_all(NewsletterTest): F
test_send_one(NewsletterTest): F
test_update(NewsletterTest): .
test_is_active?(PlanTest): .
It took me a while to track down the TESTOPTS argument. Let's look at the chain of execution when you invoke rake in a Rails app. First, rake will load the Rakefile in the rails root which in turn will boot up your Rails app, require rake itself, the rake test task, as well as all the Rails rake tasks. The default Rails rake task is test and it is defined in the testing.rake file in railties/lib/tasks. The test Rake task will invoke the test:units test:functionals test:integration Rake tasks in sequence. Those are defined by invoking Rake::TestTask.new. The Rake::TestTask class is defined in the testtask.rb file in your rake gem installation (use gem which rake to find it). It is in this file that the TESTOPTS argument is documented. It turns out that Rake::TestTask.new will execute something like this on the command line (I broke the line up for readability):
ruby -Ilib:test "/Library/Ruby/Gems/1.8/gems/rake-0.8.3/lib/rake/rake_test_loader.rb" "test/functional/application_controller_test.rb" ..all other test files here... -v
Notice the -v (for verbose) option as the last argument. So where does Test::Unit enter the picture? Well, each of your test files requires test/test_helper.rb which in turn requires test_help.rb in your Rails installation which does a require 'test/unit'. You can check out the Test::Unit sources by opening up $RUBY_LIB/1.8/test in your editor. It turns out that it's autorunner.rb in test/unit that parses out the "-v" command line options and automagically runs your tests. It's as simple as that...
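If you always want verbose output, the same flag can be baked into a task definition instead of being passed on the command line each time. A small sketch using Rake::TestTask directly; the task name and file pattern are illustrative:

```ruby
require "rake/testtask"

# Define a test task with verbose output built in. Setting t.options to "-v"
# has the same effect as running: rake test TESTOPTS="-v"
Rake::TestTask.new(:verbose_test) do |t|
  t.libs << "test"
  t.pattern = "test/**/*_test.rb"
  t.options = "-v"
end
```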
Plugin Stack Traces Missing in Rails 2.3.2
I endured a nightmarish debugging session yesterday due to Rails no longer showing proper stack traces from exceptions in plugin code. I've verified this across different Rails apps and it seems to actually be the case that with Rails 2.3.2 you no longer get a stack trace from plugins. Has anybody else had issues with this? Any pointers to where in the Rails code exception handling and stack traces are dealt with and how it could be fixed? I posted about this issue on the Rails core list but haven't received a reply yet.
You're a SlideShare RockStar
SlideShare sent me an email saying I'm a rockstar because my Ruby on Rails 101 slides have been getting so popular. I'm really glad that I decided to share my slides and SlideShare has been a great boost in reaching more people. All the great feedback that I get from people makes all the work that went into the slides more than worthwhile. I continue to be fascinated with how rewarding it can be to share information on the web.
Rails Tip: Using Rakismet to Stop Spam
I am trying out Rakismet now for this blog to prevent comment spam - an issue that has been bugging me for a long time and that has been getting worse lately. Setting up the Rakismet plugin was a breeze. Very nice! Now, let's hope it actually stops the spammers. We use Rakismet at Newsdesk and I think it's worked out really well there.
Really Simple Rails Log Rotation
I've always used the logrotate Linux tool to set up log rotation for my Rails apps. That has worked fine, although it requires maintaining an external config file and understanding its options and syntax. I never knew log rotation could be set up by adding this one line to config/environments/production.rb right in your app:
config.logger = Logger.new(config.log_path, 50, 1.megabyte)
Sweet!
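The rotation itself comes from Ruby's standard library Logger, which the Rails config line delegates to, so the same arguments work in plain Ruby. A quick sketch (the log path here is illustrative):

```ruby
require "logger"
require "tmpdir"

# Logger.new(logdev, shift_age, shift_size): keep at most 10 rotated files,
# rotating whenever the current file exceeds 1 megabyte. This is what the
# Rails config line above sets up for production.log.
log_path = File.join(Dir.tmpdir, "rotation_demo.log")
logger = Logger.new(log_path, 10, 1024 * 1024)
logger.info("Log rotation is handled by Ruby's Logger itself")
```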
Upgrading from Rails 1.2.3 to Rails 2.3.2
Have you ever tried upgrading Rails from version 1 to version 2? If not, you're in for a treat. Seriously though, it's not that bad, but depending on the size of your app, the number of plugins you use, and how badly you've patched Rails, your mileage may vary.
Here are some notes from an upgrade of this little weblog app just now:
# 0. Upgrade Rails
sudo gem install rails
rails -v # => Rails 2.3.2

# 1. Since rails:freeze:gems in 1.2.3 is not compatible with RubyGems 1.3.1 (GemRunner issue) and
#    since it doesn't know about activeresource, we'll use a fresh Rails 2.3.2 app for checking out
#    the sources:
cd ~/src
rails rails23-app
cd rails23-app
rake rails:freeze:gems
# Generate app and test code to reveal some latest best practices
./script/generate scaffold categories name:string
./script/generate integration_test demo

# 2. Create a branch for the upgrade
cd ~/src/app_to_upgrade
git checkout -b rails23 # Assumes your app is in Git

# 3. Upgrade the Rails source
rm -rf vendor/rails
mv ~/src/rails23-app/vendor/rails vendor

# 4. Upgrade config/boot.rb and public/javascripts/*. Rename application.rb to application_controller.rb
rake rails:update

# 5. Get the config/initializers/* files. Make sure the session settings are right for your app.
cp -r ~/src/rails23-app/config/initializers config

# 6. Make sure your config/environment.rb and config/environments/* files are Rails 2.3 compatible.
#    Do this by comparing your versions of the files with the ones in rails23-app. Notes:
#
#    - Do not use config.time_zone = 'UTC' unless your database datetime columns are in UTC
#      (see http://marklunds.com/articles/one/405)
#    - You want to keep your environment.rb file fairly small by breaking parts out into
#      config/initializers/* files.
#    - ActionMailer::Base.server_settings has been renamed to ActionMailer::Base.smtp_settings

# 7. Remove/upgrade/patch any plugins that your app has that may not be Rails 2.3 compatible.
#    Some stuff has been moved out of Rails and into plugins and then you'll have to install those plugins.
#    For pagination I recommend will_paginate (http://github.com/mislav/will_paginate/tree/master).

# 8. Review any patches that you have made to Rails.

# 9. Make your tests Rails 2.3 compatible:
#    - In test/test_helper.rb it's now the ActiveSupport::TestCase class that should be opened,
#      not Test::Unit::TestCase.
#    - Change your unit tests to extend ActiveSupport::TestCase.
#    - Change your helper tests to extend ActionView::TestCase and move them to a new test/unit/helpers directory.
#    - Change your functional tests to extend ActionController::TestCase.
#    - Note that fixtures :all is now the default setting. I highly recommend this setting over
#      specifying fixtures in each test (see http://marklunds.com/articles/one/386).
#      You still have to say fixtures :all in integration tests though.
#    - You may get deprecation warnings when running the tests, such as truncate now taking a hash argument.
#      Fix those :-)

# 10. Commit your branch in Git, merge it to master, and deploy.
As a bonus, here is a script to migrate your .rhtml view templates to the new .html.erb extension:
#!/usr/bin/env ruby
#
# change-extension <path> <from-extension> <to-extension>
#
# Command line script to recursively move all files under a certain directory from
# one extension to another. Example usage: moving all .rhtml files in a Rails app to .html.erb to adopt
# the new conventions in Rails 2.
unless ARGV.size == 3
puts "Usage: : <path> <from-extension> <to-extension>"
exit -1
end
root_path, from_ext, to_ext = ARGV
MOVE_COMMAND = "git mv"
Dir[File.join(root_path, "**", "*.#{from_ext}")].each do |from_path|
to_path = from_path.chomp(from_ext) + to_ext
command = "#{MOVE_COMMAND} #{from_path} #{to_path}"
puts command
system(command)
end
There you have it! Now you can enjoy Rails 2.3.2 in all its glory!
Contributing to Rails Core
At Newsdesk we recently upgraded to Rails 2.3 and ran into a Rails bug triggered by multiple post requests in tests. I was curious to figure out why the bug occurred and since I was already familiar with the Rails testing code (in test_process.rb) I was able to nail down the problem pretty quickly. Usually I would probably have stopped there, but my colleague Richard encouraged me to submit a Rails patch so I walked the extra mile and wrote a little unit test for my fix. I then followed the contribution instructions to submit an issue in Lighthouse with a patch file followed by the recommended post to the Rails core mailing list. The issue was quickly assigned to Joshua Peek in the Rails core team and a few days later it was applied. It is really encouraging to see how easy it can be to contribute to Rails and that the core team actually picks up a lot of patches and applies them if only they are submitted properly. This really illustrates the power of open source in general and Git and Rails in particular.
Rails Tip: JavaScript Validation and Testing
JavaScript test coverage is pretty non-existent in most Rails projects and if your application has a lot of JavaScript code this can become a real problem. But it doesn't have to be that way. One thing you can do just to get some basic syntax and style checking of your JavaScript code is to run it through JavaScript Lint. I've added a simple rake task to run all javascript files through the validator:
namespace :test do
namespace :javascripts do
desc "Validate all javascript files with javascript lint - assumes jsl on the command line"
# Visit http://www.javascriptlint.com to download and install the jsl command line tool
task :validate do
total_errors = 0
Dir[File.join(File.dirname(__FILE__), "..", "..", "public", "javascripts", "*.js")].each do |file_path|
print " ... "
result = `jsl -process `
n_errors, n_warnings = result.match(/(\d+) error\(s\), (\d+) warning\(s\)$/).to_a[1, 2].map(&:to_i)
if n_errors > 0
puts "FAILED"
puts result
else
puts "OK ( warnings)"
end
total_errors += n_errors
end
print "TEST RESULT: "
if total_errors > 0
puts "FAILURE - errors"
else
puts "OK"
end
end
end
end
In addition to syntax checking I recommend trying out Dr Nic's javascript_test plugin. To get it to work with the latest Prototype I had to download JsUnitTest and use that instead of unittest.js.
Rails Tip: Migrate Your Database to UTC
UPDATE: Adam Meehan pointed out that my migration didn't work with DST, i.e. different UTC offsets for datetimes at different points of the year. I updated my migration to use a UTC conversion in the database (leading to a variable interval) instead of using a fixed interval adjustment.
If you want to make use of the timezone support in Rails 2.1 and later you'll need to migrate any existing times that you have in your db to UTC. Here is a migration for PostgreSQL I wrote to do that (you'll probably need to adjust it to work on MySQL):
# This migration will work with DST. Because of DST, if you have your datetimes
# spread across the year they will have different offsets, i.e. in Stockholm we are usually UTC+1
# but UTC+2 in the summer.
When I originally wrote this migration I used a fixed interval adjustment (+1 for Stockholm), i.e. the same approach that Simon Harris uses. However, as Adam Meehan points out in the comments this doesn't work with DST so I adjusted to using a AT TIME ZONE 'UTC' conversion instead that will result in a DST dependent interval. Thanks Adam for pointing this out!
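The migration code itself did not survive in this copy of the post, but the core of the approach it describes is a DST-aware double AT TIME ZONE conversion in PostgreSQL. As a hedged sketch, here is a small helper that builds the kind of UPDATE statement you would pass to execute inside such a migration; the table, column, and zone names are illustrative:

```ruby
# Build a DST-aware PostgreSQL UPDATE for migrating a naive local datetime
# column to UTC. The first AT TIME ZONE interprets the stored value in the
# local zone (yielding a timestamptz); the second converts it back to a
# naive timestamp in UTC. Summer and winter values thus get different offsets.
def utc_migration_sql(table, column, local_zone)
  "UPDATE #{table} SET #{column} = #{column} " \
  "AT TIME ZONE '#{local_zone}' AT TIME ZONE 'UTC'"
end

puts utc_migration_sql("events", "starts_at", "Europe/Stockholm")
```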
Ruby on Rails 101 Slides Updated for Course in Italy
Last week I had the pleasure of teaching a five-day introductory course at Tiscali in Cagliari, Italy (Sardinia). The course outline was similar to the course I held back in 2007 but in this case the lectures were half-day with the afternoons dedicated to exercises and questions. This format worked out pretty well and I think it's a good idea to have half of the time devoted to hands-on training.
I've updated my old slides for Rails 2.3 and rewritten them in HTML (using S5/Codex). You can view the slides here. The source code for the slides is available on GitHub. Feel free to send feedback and reuse the slides and make any corrections and improvements that you see fit. Thanks!
Translate: New I18n Rails Plugin with Nice Web UI
Today Newsdesk released the Translate plugin that provides a nice web UI for doing translations. The plugin mounts a web UI at /translate where you can list and translate I18n texts. Translations are written directly to YAML files stored at the default location under config/locales. Check out the post at the Newsdesk blog and the README file on GitHub for more details.
Rails Timezone Gotcha: ActiveRecord::Base.find does not convert Time objects to UTC
With the timezone support introduced in Rails 2.1 the idea is that all dates in the database are stored in UTC and all dates in Ruby are in a local timezone. The local timezone can be specified by config.time_zone in environment.rb or set to the user's timezone with Time.zone= in a before filter. Typically, when reading/writing from/to the database, ActiveRecord will transparently convert time attributes back and forth to UTC for you. However, there is a gotcha with datetimes in ActiveRecord::Base.find conditions. They will only be converted to UTC for you if they are ActiveSupport::TimeWithZone objects, not if they are Time objects. This means that you are fine if you use Time.zone.now, 1.days.ago, or Time.parse("2008-12-23").utc, but not if you use Time.now or Time.parse("2008-12-23").
Apparently this issue has been reported and marked as invalid. I think it's quite unfortunate that ActiveRecord doesn't do this conversion for us. I suspect other application developers will be bitten by this as well. The difference in behaviour between Time and TimeWithZone objects boils down to the to_s(:db) call:
>> Time.now.to_s(:db)
=> "2009-01-06 17:52:19"
>> Time.zone.now.to_s(:db)
=> "2009-01-06 16:52:23"
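The same difference can be reproduced in plain Ruby without ActiveSupport by formatting a fixed-offset time with and without a UTC conversion; the timestamps mirror the console session above:

```ruby
require "time"

# A time in a +01:00 zone (e.g. Stockholm in winter). Formatting it directly
# is effectively what Time#to_s(:db) does - no UTC conversion. Converting
# with getutc first is what TimeWithZone#to_s(:db) does.
t = Time.new(2009, 1, 6, 17, 52, 19, "+01:00")

local_db = t.strftime("%Y-%m-%d %H:%M:%S")        # => "2009-01-06 17:52:19"
utc_db   = t.getutc.strftime("%Y-%m-%d %H:%M:%S") # => "2009-01-06 16:52:19"
```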
One way to fix the issue would be to monkey patch the Quoting module in ActiveRecord like this:
module ActiveRecord
  module ConnectionAdapters
    module Quoting # :nodoc:
      # Convert dates and times to UTC so that the following two will be equivalent:
      # Event.all(:conditions => ["start_time > ?", Time.zone.now])
      # Event.all(:conditions => ["start_time > ?", Time.now])
      def quoted_date(value)
        value.respond_to?(:utc) ? value.utc.to_s(:db) : value.to_s(:db)
      end
    end
  end
end
However I'm not sure that this is a good idea and that it won't break anything else. I've at least verified that it doesn't break assignment of ActiveRecord attributes.
Introducing Simple Signup - Easier Event Signups and Payments
Simple Signup is a web application that I've been working on for a while now that makes it easier for event organizers to accept signups and payments. Bodil Abelsson, who is now my business partner, approached me about a year ago with the idea and it made sense to me right away, especially since I had needed a service like this myself. If you think about it, being able to accept signups and payments is quite a common need in everything from private parties and concerts to sports, conferences, courses, and club memberships. Maintaining attendee lists and matching them up with payments made to your bank account can be quite a hassle. It is time consuming and error prone. One would expect there to be any number of established websites by now that solve this problem. However, there really aren't. Many small organizers still handle payments and signups manually. Some of the medium size organizers have systems that they have developed themselves and that are specialized for their niche. However there aren't that many services out there that recognize the generic nature of the problem. There is some competition in Germany and the US but we seem to be the first service of this kind in Sweden.
The Swedish version of Simple Signup (simplesignup.se) was launched quietly this summer. This was followed by a recent launch of the English version (simpleeventsignup.com) along with the ability to accept payments via Paypal. Swedish event organizers don't need a Paypal account but instead typically choose to have payments handled by Simple Signup. This works by having the attendee pay us via our Payment Provider Payex by credit card (VISA or MasterCard). We then transfer the money directly to the organizer's bank account. To be able to cover credit card and bank fees we charge a 4% signup fee (at least 10 SEK).
We are continually improving the site and have plans to partner up with other websites that provide services related to events. I'll have more to say about this soon. One of the bigger and more important features that we will be adding is the ability for the event organizer to design their own signup page.
We still have a long way to go, but at least I think we are off to a good start. If you have any feedback, positive or negative, it is always greatly appreciated.
MySQL Performance
Every now and then I like to pick on MySQL and it's become something of a running theme in this blog. My session table for this blog (which runs on Ruby on Rails of course) had grown too big so I needed to clean it up. I just didn't expect it to take this long:
mysql> select count(*) from sessions;
+----------+
| count(*) |
+----------+
|   545797 |
+----------+
1 row in set (2.89 sec)

mysql> delete from sessions;
Query OK, 545801 rows affected (5 min 46.67 sec)
Should it really take 3 seconds to count half a million rows? I wonder if PostgreSQL would deliver better performance in this instance. As soon as I find the time I will switch over my Rails/MySQL based web application Simple Signup to PostgreSQL. I have a strong personal preference for PostgreSQL over MySQL. I'm curious to see what the transition will be like though.
Rails Tip: SEO Friendly URLs (Permalinks)
There are several plugins already out there that can turn the typical Rails show path of /articles/show/1 into something more search engine friendly. I decided to roll my own implementation and it turned out to be fairly easy. My solution relies on the Slugalizer library by Henrik Nyh. First, I make sure we can turn any string into something URL friendly by patching the String class:
class String

# Convert String to something URL and filename friendly.
# (The class line and method signature are reconstructed; the original was
# lost, so the method name to_slug is assumed.)
def to_slug(max_size = 45, separator = "-")
no_slashes = self.gsub(%r{[/]+}, separator)
Slugalizer.slugalize(no_slashes.swedish_sanitation, separator).truncate_to_last_word(max_size, separator)
end
# We need this for SEO/Google reasons since å should become aa and Slugalizer translates å to a.
def swedish_sanitation
dup = self.dup
dup.gsub!('å', 'aa')
dup.gsub!('Å', 'aa')
dup.gsub!('ä', 'ae')
dup.gsub!('Ä', 'ae')
dup.gsub!('ö', 'oe')
dup.gsub!('Ö', 'oe')
dup.gsub!('é', 'e')
dup.gsub!('É', 'e')
dup
end
def truncate_to_last_word(length, separator = "-")
dup = self.dup
if dup.size > length
truncated = dup[0..(length-1)]
if truncated.include?(separator)
truncated[/^(.+)#{separator}/, 1]
else
truncated
end
else
dup
end
end
end
All I have to do in my ActiveRecord model then is to override the to_param method:
# (The def lines below are reconstructed; the slug helper name is assumed.)
def permalink
if name.present?
"#{id}-#{name.to_slug}"
else
id
end
end

def to_param
permalink
end
ActiveRecord will automatically ignore any non-digit characters after the leading digits in an id that you pass to it, but just to be on the safe side I added a before_filter to my application controller that will convert permalinks to proper ids:
# (The def line is reconstructed and the filter method name is assumed;
# register it in ApplicationController with before_filter.)
def normalize_permalink_id
if params[:id].present? && params[:id] =~ /\D/
params[:id] = params[:id][/^(\d+)/, 1]
end
end
Credit for parts of this code goes to my cool programmer colleagues over at Newsdesk.se.
Rails Hack: Auto Strip ActiveRecord Attributes
We have a user who unintentionally enters a space after his email address. It seems that a lot of the time it makes sense to automatically strip ActiveRecord model attributes before they are validated. Inspired by this post I came up with my own auto_strip method that adds a before validation callback, which seems less intrusive than redefining the attribute setter method:
# Sometimes users accidentally enter space before or after text in text fields. Let's not punish them
# with an error message for this.
# (The class and method scaffolding below is reconstructed; the original opening lines were lost.)
class ActiveRecord::Base
class << self
def auto_strip(*attributes)
attributes.each do |attribute|
before_validation do |record|
record.send("#{attribute}=", record.send("#{attribute}_before_type_cast").to_s.strip) if record.send(attribute)
end
end
end
end
end
Now in my ActiveRecord model I can say for example:
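The model-level call would be something like auto_strip :email. As a self-contained illustration of the effect (no Rails; the User class and its validation are plain-Ruby stand-ins invented for this sketch):

```ruby
# A plain-Ruby stand-in showing what auto_strip achieves: whitespace is
# stripped before validation, so a trailing space in an email field doesn't
# trip a validation or break a later lookup.
class User
  attr_accessor :email

  def valid?
    strip_attributes
    email.to_s.include?("@")
  end

  private

  # Mirrors the before_validation callback installed by auto_strip
  def strip_attributes
    self.email = email.strip if email
  end
end

user = User.new
user.email = " user@example.com "
user.valid? # => true
user.email  # => "user@example.com"
```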
Maybe auto stripping would be useful as an option to the Rails validation macros?
Rails I18n: Array#to_sentence and :skip_last_comma
The Array#to_sentence method in ActiveSupport joins an array into a string with a locale dependent connector such as "and" for English and "och" for Swedish. The sentence connector is determined by the I18n message support.array.sentence_connector. The issue is that in English you have a comma before the last connector whereas in Swedish you don't. The existence of the last comma is set with the :skip_last_comma option. Ideally we would like this to always be true for the Swedish locale. Therefore I added the I18n key support.array.sentence_skip_last_comma and patched the to_sentence method:
class Array
alias_method :to_sentence_orig, :to_sentence

# (The class and def lines are reconstructed; the original wrapper was lost.)
def to_sentence(options = {})
extra_options = {}
skip_last_comma = I18n.translate(:'support.array.sentence_skip_last_comma')
extra_options[:skip_last_comma] = skip_last_comma if skip_last_comma !~ /translation missing: /
to_sentence_orig(options.reverse_merge(extra_options))
end
end
Here is the corresponding spec:
it "can join array elements with Array#to_sentence" do
with_locale("en-US") {["knatte", "fnatte", "tjatte"].to_sentence.should == "knatte, fnatte, and tjatte" }
with_locale("sv-SE") {["knatte", "fnatte", "tjatte"].to_sentence.should == "knatte, fnatte och tjatte" }
with_locale("sv-SE") do
["knatte", "fnatte", "tjatte"].to_sentence(:skip_last_comma => false).should == "knatte, fnatte, och tjatte"
end
end
Upgrading to Rails 2.2 and Drinking the I18n Koolaid
The other day I got tired of waiting for the Rails 2.2 release and upgraded my application from Rails 2.1 to edge. The process went a little something like this:
rake rails:freeze:edge
cd vendor/plugins
rm -rf rspec
rm -rf rspec_on_rails
git clone git://github.com/dchelimsky/rspec.git
git clone git://github.com/dchelimsky/rspec-rails.git
rm -rf rspec/.git
rm -rf rspec-rails/.git
cd ../../
./script/generate rspec
At this point I ran my Test::Unit tests and RSpec specs with rake and ended up with a single failing test and a bunch of warnings:
- A controller test failed due to some routing change. It expected a redirect to http://test.host/ticket_types/106767319 but got http://test.host/events/592144301/ticket_types/106767319. I fixed this by commenting out the test since I was going to remove the UI anyway.
- DEPRECATION WARNING: The binding argument of #concat is no longer needed. I simply removed the binding argument.
- ActiveSupport namespacing. I needed to move OrderedOptions into ActiveSupport::OrderedOptions in the app_config plugin.
- DEPRECATION WARNING: truncate takes an option hash. I changed my invocations of truncate to use a hash.
Overall, it was fairly easy to transition from 2.1 to edge. The really interesting part though was I18n. Thanks to the new I18n support in Rails I got to throw out the Simple Localization plugin that apparently is no longer supported, as well as my own hack to get error messages localized. I used the I18n demo app as a starting point and added config/locales/sv-SE.yml. It turned out the structure of I18n keys had changed a little since the demo app was written. You can use my file as a starting point or, probably better, copy the following files into your single locales file and translate those:
vendor/rails/actionpack/lib/action_view/locale/en-US.yml
vendor/rails/activerecord/lib/active_record/locale/en-US.yml
vendor/rails/activesupport/lib/active_support/locale/en-US.yml
I source the locales file from config/initializers/i18n.rb:
I18n.default_locale = 'en-US'
LOCALES_DIRECTORY = "#{RAILS_ROOT}/config/locales"
locale_files = Dir["#{LOCALES_DIRECTORY}/*.{rb,yml}"]
LOCALES_AVAILABLE = (locale_files.collect do |locale_file|
File.basename(File.basename(locale_file, ".rb"), ".yml")
end + ['en-US']).uniq
locale_files.each {|locale_file| I18n.load_translations locale_file }
I am using the gibberish and gibberish_translate plugins and am quite happy with those. Of course, it would be nice if they were rewritten to use the new I18n API. Another TODO item is to move attribute names over from Gibberish to I18n. I have my own override of ActiveRecord::Base.human_attribute_name that is no longer needed now that the 2.2 version of the method does I18n message lookups so nicely (with a key naming convention like 'activerecord.attributes.model_name.attribute_name').
Thanks so much to Sven Fuchs and the I18n team, Jeremy Kemper, and all the others who made I18n a part of Rails! Code will be cleaner from now on and life easier...
Test Driven Development with Ruby
I am closing my DreamHost account, partly because they have security issues and partly because I just don't need it anymore. I had to move my Test Driven Development with Ruby article to a new home here at marklunds.com. Hopefully Google will pick up the new page and update its index soon.
Modularizing Your Rails App with Concerns
In his keynote here at RailsConf Europe 2008, David (DHH) talked about living with legacy code, how we should enjoy it instead of trying to avoid it, and how it can give us new insights by showing us how we have grown as developers. I loved the keynote and it resonated really well with my own experiences. It's also highly relevant with my current work at Newsdesk and Simple Signup.
As an example of refactorings you might find yourself doing in legacy Rails apps, David showed us how to break a big application helper or a fat model into Ruby modules. The idea was to find groups of methods that represent a certain concern or aspect of your app and collect those in a module. This is typically not done for reuse but to make your code more readable and easier to navigate.
Last week I found myself creating a plugin for a certain aspect of my application, namely the acceptance of terms of its service. It was a minimalist plugin with just a helper method and a controller filter method. The code was highly application specific and thus it felt wrong to keep it in the plugin directory. After all, plugins are supposed to be shared across applications and my plugin was inherently tied to the application. Still, I wanted to have the ability to keep an aspect of my application in its own directory, especially when the aspect spans across several layers of MVC. The solution I came up with was to create a new directory RAILS_ROOT/app/concerns with a sub directory and an init.rb file for each concern, very much like with plugins. I then generated a concerns plugin to make sure my concerns get loaded:
Dir[File.join(RAILS_ROOT, "app", "concerns", "*", "init.rb")].each do |init_file|
  require init_file
end
I talked to David (DHH) about this approach and the funny thing was that he had experimented with it too. David says the approach can be appropriate if you use it wisely. If you overuse it, your concerns directory will end up being a new "garbage can" and bring you back to the problem that you were trying to solve with concerns in the first place... :-)
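As a sketch of what such a concern directory might contain, here is a hypothetical terms-of-service concern spanning the controller and view layers. All module and method names here are invented for illustration (the original code was not published), and in a real app the controller filter would rely on Rails helpers like redirect_to:

```ruby
# Hypothetical contents of app/concerns/terms_of_service/terms_of_service.rb,
# loaded via the concern's init.rb. One module per MVC layer the concern touches.
module TermsOfService
  module ControllerMethods
    # Intended to be registered as a before_filter in ApplicationController
    def require_accepted_terms
      redirect_to terms_path unless current_user.accepted_terms?
    end
  end

  module HelperMethods
    def terms_link
      link_to "Terms of Service", terms_path
    end
  end
end
```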
Introducing Rails Mentor
Fellow Rails developer Carl-Johan Kihlbom from Gothenburg and I have just founded Rails Mentor. Rails Mentor is a network of Ruby on Rails experts and we offer mentorship and training in anything related to Ruby on Rails. Carl-Johan and I are committed to Rails best practices and together we have broad experience of applying Ruby on Rails to varying types of projects. Now we would like to help spread this knowledge to others. If you need to be brought up to speed quickly with Rails or need a code or architecture review, please don't hesitate to get in touch with us.
We are currently at RailsConf in Berlin and we are giving away free Rails Mentor t-shirts, so come talk to us if you want one!
Installing Ruby on Rails on Ubuntu 8.04 Hardy Heron
I've compiled a set of detailed instructions on how to install Ruby on Rails on Ubuntu 8.04 Hardy Heron. The instructions are available in GitHub and they show you how to turn a clean Ubuntu 8.04 install into a production ready Ruby on Rails stack including MySQL, Nginx with fair proxy balancer, Monit, and Mongrel Cluster. There are a few additions and improvements I'd like to make:
- A more recent version of Ruby
- Logfile rotation
- Production log analyzer
- PostgreSQL
Please help me point out what else is missing or what I can improve.
I'm talking to GleSYS about the possibility of offering my production setup as an installable VPS image.
Faster Capistrano Subversion Deployments with remote_cache
This is old news, but if you are deploying an application with Capistrano from Subversion and you find yourself waiting impatiently for the checkout to happen everytime, try adding this to deploy.rb:
set :deploy_via, :remote_cache
With this setting Capistrano will keep a Subversion checkout under shared/cached-copy, and on each deploy just do an svn update followed by a local copy of the whole tree into releases. I found this to be significantly faster than the default checkout on each deploy.
I am planning to switch to Git and GitHub now, and I am curious to see if this setting will affect deployment speed with Git. Given that a git clone in my experience is blazingly fast, maybe the remote cache won't be needed?
Installing Telenor 3g Modem (Mobilt Bredband) on Ubuntu
As far as I know none of the turbo 3g GSM modems on the Swedish market officially support Linux. However, I was able to get my Telenor modem working just fine on my Ubuntu Eee just now. I used the USB_ModeSwitch software along with Swedish instructions from Hasain. Once my modem was recognized I ran sudo wvdialconf. I then edited my /etc/wvdial.conf to be:
[Dialer Defaults]
#Init1 = ATZ
Init1 = AT+CPIN=<YOUR PIN HERE>
Init2 = ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
Modem Type = Analog Modem
Baud = 9600
New PPPD = yes
Modem = /dev/ttyUSB0
ISDN = 0
Phone = *99#
Password = peter
Username = peter
Note that you have to change the PIN above. I then ran sudo wvdial and voila - I was online! Well, online on a slow and unreliable connection that is. Oh well, I'm thinking of changing to tre.se one of these days. They supposedly have the best 3g network.
Asus Eee + Ubuntu + Rails
On Thursday I bought myself an Asus Eee 900 - a tiny and cheap Linux powered laptop that is currently selling out in stores here in Sweden. I got a lot of attention in the office with this laptop and within a day it seemed every programmer in the office had an Eee on their desk.
I installed Ubuntu Eee and this was a huge improvement over the Linux OS that Asus provides. I am ashamed to admit that I haven't used Linux on the desktop for a long time and I was totally blown away by how advanced, slick, and user friendly Ubuntu has gotten.
Obviously, my ultimate goal was to install Ruby on Rails. At first I wanted to install from source in order to get an exact version and patch level of Ruby, namely the one that is officially recommended on ruby-lang.org. However, when I attempted this, various libraries were missing. I found a FiveRuns article listing the packages I needed, but after I had installed them I ran into an issue with the MD5 library. In the end I resorted to using the ruby-full package, which gives you the old tried and tested Ruby 1.8.6 patch level 111 (without the recent security patches). Here, roughly, are the steps I went through to set up my Rails environment:
#####################################
# RUBY
#####################################
sudo apt-get install ruby-full
which ruby
ruby -v
# => ruby 1.8.6 (2007-09-24 patchlevel 111) [i486-linux]
ruby -ropenssl -rzlib -rreadline -e "puts :success"

#####################################
# RUBYGEMS
#####################################
# Get latest stable recommended release of RubyGems from rubygems.org
wget http://rubyforge.org/frs/download.php/38646/rubygems-1.2.0.tgz
tar xzf rubygems-1.2.0.tgz
cd rubygems-1.2.0/
sudo ruby setup.rb
# Not sure if/why this step is necessary
sudo ln -s /usr/bin/gem1.8 /usr/bin/gem
which gem
gem --version
# => 1.2.0

#####################################
# MYSQL
#####################################
sudo aptitude install mysql-server mysql-client libmysqlclient15-dev

#####################################
# Some useful Gems
#####################################
sudo gem install rails mongrel capistrano mysql
I haven't double checked those instructions for accuracy since I would have to re-install my OS to do that. If you find errors or have improvements, please let me know. Here is a sample of articles on the subject that you might want to check out if you want to dig deeper:
Rails Configuration: Yielding self in initializer
I've come across situations in Rails where you repeatedly invoke methods on some class with a long name, and it gets ugly and tedious to repeat the class name on every line. I just realized that, unlike in the config/environments files, the config object is not available in the config/initializers files. I use the AppConfig plugin to parameterize my application, and I came up with the yield_self method to make my config/initializers/app_config.rb more readable:
# Method to supplement the app_config plugin. I want to crash early and be alerted if
# I have forgotten to define a parameter in an environment.
def config_param(name)
  AppConfig.param(name) do
    raise("No app_config param '#{name}' defined in environment #{RAILS_ENV}, " +
          "please define it in config/environments/#{RAILS_ENV}.rb and restart the server")
  end
end

module AppConfig
  class Base
    def self.yield_self
      yield self
    end
  end
end
AppConfig::Base.yield_self do |config|
config.site_name = "Simple Signup"
config.admin_email = '"Simple Signup" <info@simplesignup.se>'
config.exception_emails = %w(info@simplesignup.se)
config.email_prefix = "[ ]"
config.signup_timeout = 5
config.send_activate_email = false
config.transaction_fee = lambda do |price|
price.blank? ? 0.0 : [10.0, 5.0+0.035*price.to_f].max.round(2)
end
config.bank_fee = lambda do |price, card|
0.0
end
end
ExceptionNotifier.exception_recipients = config_param(:exception_emails)
ExceptionNotifier.sender_address = %{peter_marklund@fastmail.fm}
ExceptionNotifier.email_prefix = " ERROR: "
The method config_param might be more appropriately named app_config. That will be a refactoring for another day though.
Ruby Gotcha: Default Values for the options Argument Hash
Like Java, Ruby passes references to objects rather than copies of them. This means it's possible for a method to modify the arguments that it receives. This can happen unintentionally and be a very unpleasant surprise for the caller. A good example of this is the typical options = {} at the end of the method argument list. If you set a default value in that hash, then that is a potential issue for the caller when the options hash is reused in subsequent calls (e.g. in a loop). See below for an example:
def foreign_key(from_table, from_column, options = {})
  options[:to_table] ||= from_column.to_s[/^(.+)_id$/, 1].pluralize
  execute ["alter table #{from_table}",
           "add constraint #{from_table}_#{from_column}_fk",
           "foreign key (#{from_column})",
           "references #{options[:to_table]} (id)",
           ""].join(" ")
end

def foreign_keys(from_table, from_columns, options = {})
  from_columns.each do |from_column|
    foreign_key(from_table, from_column, options)
  end
end
In the first invocation of foreign_key, options[:to_table] will be set (if it isn't set already) to the destination table of the first column. The options[:to_table] value will be retained throughout the loop, causing all foreign keys to point to the same table. The fix is to make to_table a local variable, or to add an "options = options.dup" line at the beginning of the method.
Lesson learned - avoid modifying the options hash and any other arguments that the caller doesn't expect will be modified.
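To make the failure mode concrete, here is a minimal self-contained sketch of the same gotcha (the method and hash contents are illustrative, not the migration helper itself):

```ruby
# The method mutates the caller's options hash via ||=, so the value
# computed in the first call leaks into every later call.
def add_key(columns_seen, column, options = {})
  options[:to_table] ||= column.sub(/_id$/, '') + "s"
  columns_seen[column] = options[:to_table]
end

seen = {}
opts = {}
["author_id", "publisher_id"].each { |c| add_key(seen, c, opts) }
seen   # => {"author_id"=>"authors", "publisher_id"=>"authors"} -- both "authors"!

# The fix: dup the hash (or use a local variable) before mutating it.
def add_key_safe(columns_seen, column, options = {})
  options = options.dup
  options[:to_table] ||= column.sub(/_id$/, '') + "s"
  columns_seen[column] = options[:to_table]
end

seen2 = {}
opts2 = {}
["author_id", "publisher_id"].each { |c| add_key_safe(seen2, c, opts2) }
seen2  # => {"author_id"=>"authors", "publisher_id"=>"publishers"}
```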
Rails Testing Tip: Use Global Fixtures to Avoid Fixture Mayhem
I'm in a Rails team with mixed opinions on whether to use fixtures. Therefore we have everything from RSpec specifications that use a lot of mocking/stubbing and don't touch the database, to specifications that set up their own database data through helper methods, and the specifications that I write that rely mostly on fixture data. What I have found is that when you don't use global fixtures (a setting in your test_helper.rb or spec_helper.rb file) you can run into situations where seemingly unrelated specifications/tests fail, and fail in different ways depending on whether you run them in isolation, through autotest, or with rake. What is going on is test data spillover/interference between tests. This can lead to very long and frustrating debugging sessions indeed. The best way to avoid this seems to be to turn on global fixtures. This will probably increase the specification run time, an issue that I partially address by keeping the number of records in my fixture files to a minimum. Also, I prioritize test coverage and convenient access to a common set of test data over making my specifications run faster.
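For reference, this is roughly what turning on global fixtures looked like in the Rails 2 / RSpec 1 era (the exact option names are from memory, so treat them as assumptions and check your own helper files):

```ruby
# test/test_helper.rb (Test::Unit)
class Test::Unit::TestCase
  self.use_transactional_fixtures = true
  fixtures :all  # load every fixture set for every test
end

# spec/spec_helper.rb (RSpec 1)
Spec::Runner.configure do |config|
  config.use_transactional_fixtures = true
  config.global_fixtures = :users, :articles  # these load for every spec
end
```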
Rails Optimistic Locking - Not Worth it for Me
When I upgraded to Rails 2.1, ActiveRecord partial updates were turned on, i.e. when you save a record only those attributes that have changed are saved to the database. In theory, if you have two almost simultaneous updates, and you have a validation rule across several columns, then those updates can leave the database record in an invalid state. Of course, in practice, this is very unlikely to happen. Nevertheless, I decided to turn on optimistic locking to be on the safe side. It turned out optimistic locking caused more issues than it was worth. Suppose you have an Article model that has many chapters and that also uses a counter cache. Then you can run into this issue:
article = Article.first
article.chapters.create(:name => "Summary")
# => UPDATE articles SET chapters_count = chapters_count + 1,
# lock_version = lock_version + 1 WHERE id = XXX;
article.publish_date = 1.days.from_now
article.save
# => throws ActiveRecord::StaleObjectError
I like partial updates though, since they make it less likely that simultaneous updates will clobber each other, given that you are only writing to the db what has changed. They also make the SQL statements in the log file more readable.
I'm abandoning optimistic locking for now though.
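For anyone else abandoning it: rather than dropping the lock_version column right away, you can switch optimistic locking off with ActiveRecord's lock_optimistically flag:

```ruby
# config/environment.rb (or an initializer):
# ignore lock_version columns instead of using them for optimistic locking
ActiveRecord::Base.lock_optimistically = false
```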
Ruby Gotcha: Symlinked Scripts and File.dirname(__FILE__)
If you have a Ruby script say in ~/src/ruby/my_script that you are symlinking to from ~/bin/my_script, then invoking File.dirname(__FILE__) in that script will yield the directory of the symlink not the directory of the script file. If you want the directory of the script file you can do this instead:
THIS_FILE = File.symlink?(__FILE__) ? File.readlink(__FILE__) : __FILE__
THIS_FILE will contain the path to the script file instead of the path to the symlink. This is valuable if say you want to require some Ruby library from your script and you are using a relative path.
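The difference is easy to demonstrate with a throwaway script and symlink (a stdlib-only sketch; note that File.readlink returns whatever path the link stores, which can be relative):

```ruby
require 'tmpdir'
require 'fileutils'

dir    = Dir.mktmpdir
script = File.join(dir, "src", "my_script")
link   = File.join(dir, "bin", "my_script")
FileUtils.mkdir_p(File.dirname(script))
FileUtils.mkdir_p(File.dirname(link))
File.open(script, "w") { |f| f.puts "# the real script" }
File.symlink(script, link)  # bin/my_script -> src/my_script

naive_dir = File.dirname(link)  # the symlink's directory ("bin")

this_file = File.symlink?(link) ? File.readlink(link) : link
real_dir  = File.dirname(this_file)  # the script's directory ("src")
```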
Rails Tip: Validating Option Arguments
I think it's a good convention to validate option arguments passed to methods, i.e. to make sure they have valid keys and values. Misspellings can otherwise lead to unnecessary debugging sessions. Rails comes with the Hash#assert_valid_keys method. I added the assert_value method:
module ActiveSupport #:nodoc:
  module CoreExtensions #:nodoc:
    module Hash #:nodoc:
      # Assert that option with given key is in list
      def assert_value(key, options = {})
        options.assert_valid_keys([:in])
        if !(options[:in].map(&:to_s) + ['']).include?(self[key].to_s)
          raise(ArgumentError, "Invalid value '#{self[key]}' for option '#{key}', " +
                "must be one of '#{options[:in].join("', '")}'")
        end
      end
    end
  end
end
Example usage and RSpec specification:
def price(product_ids, options = {})
  options.assert_valid_keys([:billing_period])
  options.assert_value(:billing_period, :in => ["annual", "monthly"])
  # method logic here...
end

# RSpec:
it "cannot be invoked with an invalid option" do
  lambda { @campaign.price([23], :foobar => true) }.should raise_error(ArgumentError)
end

it "cannot be invoked with an invalid billing period" do
  lambda { @campaign.price([23], :billing_period => :foobar) }.should raise_error(ArgumentError)
end
Rails plugin: acts_as_state_machine_hacked
I've been using the acts_as_state_machine plugin in a couple of projects. I think the syntax and functionality of the plugin is quite nice. It allows you to easily define states and events (transitions between states) for your ActiveRecord model. However, I wanted to be able to see which events are available in the current state. I also thought that invoking an event, such as user.activate!, when the user is in a state where the activate event is not available should not fail silently, but rather throw an exception. Likewise, if the event fails to fire because of a guard condition, an exception should be raised. I encapsulated those changes in the plugin acts_as_state_machine_hacked.
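The gist of the change, as a minimal self-contained sketch (a hypothetical class, not the plugin's actual implementation):

```ruby
# Expose the events available in the current state, and raise on an
# unavailable event instead of failing silently.
class User
  TRANSITIONS = {
    :pending   => { :activate => :active },
    :active    => { :suspend  => :suspended },
    :suspended => { :activate => :active }
  }

  attr_reader :state

  def initialize
    @state = :pending
  end

  # Which events may be fired from the current state?
  def current_events
    TRANSITIONS[state].keys
  end

  def fire!(event)
    to_state = TRANSITIONS[state][event]
    raise "Event '#{event}' not available in state '#{state}'" unless to_state
    @state = to_state
  end
end

user = User.new
user.current_events   # => [:activate]
user.fire!(:activate) # pending -> active
begin
  user.fire!(:activate)  # not available in :active
rescue RuntimeError => e
  # e.message: "Event 'activate' not available in state 'active'"
end
```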
DreamHost Hacked: All My Files Exposed Publicly
An ex-colleague of mine discovered that all my files in my home directory at the hosting company DreamHost were publicly viewable and downloadable on the web. I was quite shocked. I had certainly not intended to share all my private files with the world, especially since they contained some highly sensitive information. I assumed my account at DreamHost had been hacked. However the response from DreamHost support was that this was not the case. They explained that it was merely a symbolic link to my home directory that had been created:
"if you would like to keep this from happening you can prevent all other users on the server from viewing your account's files by enabling the Enhanced Security feature for your user. Just go to the Users > Manage Users section of your panel, click the "Edit" link next to your user, and then check to enable the Enhanced Security option. Hit the "Save Changes" button and you should be set in about 20 minutes.
The /home/ directory is public and it is not a security breach that the other user was able to create a symbolic link to /home/. Other users on the server have always had the same access, which means that they have been able to view your files but they absolutely cannot make any changes to your files or folders. The Enhanced Security feature takes it a step further and prevents any user from even viewing your files or folders.
So, just to be clear, there is no indication of a server hack or any security intrusion."
I wrote back that I had changed to "Enhanced" security and that my files were still exposed. Here are some excerpts from their second reply:
"Ultimately this was just some funny permissions on your home directory which caused this to be allowed to happen."
"When I changed your home directory's group ownership back to your default group (pg136611) this corrected the insecurity of other user's accessing your files via apache"
"The interesting part is it may have been enabling the extra web security which caused this insecurity."
A few days later I found that my files were still exposed and I had to manually change the group of my home directory. Basically as far as I'm concerned this means the issue has still not been fixed in a reliable fashion.
I've heard no apology from DreamHost so far. In fact, there is not much in their replies that indicates that they are even taking the issue very seriously. I'm quite disappointed and I am not left with much confidence in DreamHost when it comes to security and privacy.
Rails Testing Tip: Validate your Fixtures
I realized today how important it can be to validate the fixtures you use in your Test::Unit tests or RSpec specifications. I made some schema and validation changes and neglected to update all my fixtures, which led to a long and tedious debugging session. I added this RSpec specification to make sure I never have invalid fixtures again:
describe "fixtures" do
  it "should be valid" do
    ActiveRecord::Base.fixture_tables.each do |table_name|
      klass = table_name.to_s.classify.constantize
      klass.send(:find, :all).each do |object|
        puts("#{object.class} #{object.id} is invalid: #{object.errors.full_messages}") if !object.valid?
        object.should be_valid
      end
    end
  end
end
Note: the fixture_tables method is just a method I have defined that returns a list of all my fixture tables, and I use it to set global fixtures in my test_helper.rb and spec_helper.rb files. If you are not using global fixtures, you can use this spec instead:
describe "fixtures" do
  it "should be valid" do
    Fixtures.create_fixtures(fixture_path, all_fixture_tables)
    all_fixture_tables.each do |table_name|
      begin
        klass = table_name.to_s.classify.constantize
        klass.send(:find, :all).each do |object|
          puts("#{object.class} #{object.id} is invalid: #{object.errors.full_messages}") if !object.valid?
          object.should be_valid
        end
      rescue NameError
        # Probably a has and belongs to many mapping table with no ActiveRecord model
      end
    end
  end

  def fixture_path
    Spec::Runner.configuration.fixture_path
  end

  def all_fixture_tables
    Dir[File.join(fixture_path, "*.yml")].map { |file| File.basename(file[/^(.+)\.[^.]+?$/, 1]) }
  end
end
I think it would be nice if Rails/RSpec had fixture validation built in and turned on by default.
RSpec Presentation
I gave a presentation on RSpec today at Diino.com - an online backup provider - and the slides are available here.
Stockholm Ruby User Group Meetup Tonight
We are having a Stockholm Ruby User Group (SHRUG) meetup tonight at Connecta here in Stockholm and it's good fun. Martin Kihlgren demoed Grusome, a server based 2D game with a server written in Ruby and C for performance. Albert Ramstedt organized an RRobots workshop where you get to code your own little Ruby robot that will fight other robots.
I gave the presentation "Building Web Apps with Rails 2 - Conventions and Plugins I Use" and the slides are available.
Rails Internationalization with new gibberish_translate Plugin
We needed to internationalize the user interface of a Rails application that we are building and looked at a plethora of alternatives, such as Globalize, Globalite, and the localization plugin by DHH. Finally we settled on the Gibberish plugin. Gibberish uses an unusual hybrid approach with both inline English texts in the code and keys for message lookups. A typical Gibberish lookup looks like this:
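From memory of the Gibberish API, a lookup is a String#[] call pairing inline English text with a symbol key. Here is a minimal self-contained sketch of that lookup style (not the plugin's actual code; the translation table and key names are illustrative):

```ruby
# Gibberish-style lookup: return the translation registered for the symbol
# key if there is one, otherwise fall back to the inline English string.
TRANSLATIONS = { :hello => "Hej" }  # e.g. loaded from lang/sv.yml

class String
  alias_method :orig_brackets, :[]
  def [](*args)
    if args.length == 1 && args.first.is_a?(Symbol)
      TRANSLATIONS[args.first] || self
    else
      orig_brackets(*args)  # normal String#[] behavior for everything else
    end
  end
end

"Hello"[:hello]      # => "Hej"
"Goodbye"[:goodbye]  # => "Goodbye" (no translation, falls back to English)
```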
To complement the Gibberish plugin I've drafted the gibberish_translate plugin, which adds a script for extracting all message lookups from a Rails app and a controller with a web UI for doing translations. The plugin also keeps track of which English text a translation was made from, so that you can flag translations when the English text changes. The plugin avoids using the database and works directly against the YAML message files in the lang directory.
Rails Gotcha: Date Arithmetic, Time Zones, and Daylight Savings
Time zones and daylight savings can cause us programmers a lot of headaches. This is evidenced by the fact that probably the most popular post ever in this blog is the one about a timezone aware datetime picker. Today I spent several hours debugging a problem a client was having with an infinite loop caused by the daylight saving transition October 28/29 in conjunction with 1.day.since. One of the chapters in the Code Review PDF by Geoffrey Grosenbach is about keeping time in your Rails applications in UTC. After today's exercises I could not agree more with Geoffrey. In fact, I would go as far as to say that date arithmetic in Rails 1.2 is not reliable unless you set your ENV['TZ'] variable.
In Europe we have daylight savings time between the last Sunday in March and the last Sunday in October, so this year it was between March 25th and October 28th. Here in Stockholm this means that in the summer we are at UTC+2 and in the winter (i.e. most of the time...) at UTC+1. We can see this in the Rails console:
>> ENV['TZ']
=> nil
>> Time.parse("2007-03-25")
=> Sun Mar 25 00:00:00 +0100 2007
>> Time.parse("2007-03-25").dst?
=> false
>> Time.parse("2007-03-26")
=> Mon Mar 26 00:00:00 +0200 2007
>> Time.parse("2007-03-26").dst?
=> true
>> Time.parse("2007-10-28")
=> Sun Oct 28 00:00:00 +0200 2007
>> Time.parse("2007-10-28").dst?
=> true
>> Time.parse("2007-10-29")
=> Mon Oct 29 00:00:00 +0100 2007
>> Time.parse("2007-10-29").dst?
=> false
To see how the daylight savings transitions can be a problem, suppose you have the date string "2007-10-28" (just before the transition) and you want the date string for the following day. In Rails 1.2.5 with Ruby 1.8.5 this will happen:
>> 1.day.since Time.parse("2007-10-28") => Sun Oct 28 23:00:00 +0100 2007
Notice how we are jumping to 23:00 the same day instead of 00:00 the next day. With Edge Rails (soon to be Rails 2.0), we get this instead:
>> 1.day.since(Time.parse("2007-10-28"))
=> Mon Oct 29 00:00:00 +0100 2007
>> 1.day.since(Time.parse("2007-10-28")).strftime("%Y-%m-%d")
=> "2007-10-29"
In Rails 1.2.5, 1.day.since(date) is the same as doing (date+24*60*60). In Rails 2.0 though 1.day.since(date) will advance only the day part of the date and leave the hours alone, thus making sure the day is incremented even when crossing over a daylight savings transition.
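You can reproduce the underlying behavior with plain Ruby (assuming a system tz database that knows Europe/Stockholm):

```ruby
ENV['TZ'] = 'Europe/Stockholm'

start    = Time.local(2007, 10, 28)  # Sun Oct 28 00:00:00 +0200 2007, DST still on
plus_24h = start + 24 * 60 * 60      # what Rails 1.2's 1.day.since amounts to

plus_24h.day   # => 28 -- still October 28th!
plus_24h.hour  # => 23
```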
To be sure to avoid any issues related to daylight savings, make sure to set this in your environment.rb:
ActiveRecord::Base.default_timezone = :utc # Store all times in the db in UTC
ENV['TZ'] = 'UTC'                          # Make Time.now return time in UTC
Then use tzinfo or some other library to adjust the time to/from the local time when you display it and retrieve it from the user.
Note: this post was updated the morning after I posted it because when I first wrote it I was apparently too tired to understand what was going on... :-)
Ruby on Rails 101: Presentation Slides for a Five Day Course
UPDATE: The slides have been updated for Rails 2.3 (in February 2009) and are available here.
I've decided to share the presentation slides that I developed for the five day introductory Ruby on Rails course that I held in June here in Sweden. All in all it's 340 slides available under a creative commons license. You can download the slides as a PDF file here or view them over at Slideshare. To give you an idea of what's inside, here are the chapters:
- Rails Introduction
- Ruby
- Migrations
- ActiveRecord Basics
- ActionController Basics
- ActionView Basics
- Testing
- ActiveRecord Associations
- ActiveRecord Validations
- ActiveRecord Callbacks
- ActionView Forms
- Filters
- Caching
- AJAX
- Routing
- REST
- ActionMailer
- Plugins
- ActiveSupport
- Rails 2.0
- Deployment
- Resources
- Parting Words of Advice
I hope the slides will be useful in helping people learn and teach Rails. I'd like to thank David Heinemeier Hansson for creating such a wonderful framework and Dave Thomas for doing such great work documenting it.
Rails Search Plugin Review: acts_as_fulltextable
The other day I had the opportunity to install and evaluate the recently announced acts_as_fulltextable Rails plugin. The raison d'etre of the plugin is to offer easy access to the MySQL full-text search engine. The plugin uses a FulltextRow ActiveRecord model stored in a MyISAM table with a polymorphic reference (id and model name) to the model that you want indexed and a single column to hold the search index text. All you need to do is basically add an acts_as_fulltextable declaration to the model that you want to search and this will then create a has_one :fulltext_row relationship and the appropriate callbacks to populate the index on create, update, and delete.
The acts_as_fulltextable plugin mostly exceeded my expectations, especially in terms of easy setup and maintenance. It makes site-wide search (across all models) as well as model specific search (constraining your search to one or more models) very easy. A strength of the plugin is its simplicity. There is very little code and the architecture is straightforward and easy to grasp.
Something to be aware of is that MySQL uses OR searches by default. I wanted to follow the Google convention of AND searches and found myself having to prefix all query terms with a plus sign to accomplish this. Is there a better way? A more serious issue is that MySQL doesn't seem to support word stemming. It turns out that the acts_as_fulltextable plugin does a gsub on search queries to add a trailing * to each search term. MySQL calls this truncation. The star ("*") works as a wild card. If you are aware of this feature you can use it to manually work around the absence of stemming, just search for house* and that will match both "house" and "houses". It will also match "household" which may or may not be what you want.
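The query munging described above amounts to something like this (a sketch, not the plugin's actual code):

```ruby
# AND all terms together (leading +) and enable MySQL truncation (trailing *)
# so that "house" also matches "houses" (and "household"...).
def boolean_query(query)
  query.split.map { |term| "+#{term}*" }.join(" ")
end

boolean_query("summer house")  # => "+summer* +house*"
```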
A documented gotcha that I ran into is that if you combine the leading plus sign and the trailing star (i.e. use both boolean AND and truncation) then stop words will not be removed from your query and thus you may end up with zero results. This is very annoying. What is the solution? I don't know. Maybe go with OR searches, live with the stemming issue (without the star), or find a way to remove stop words yourself. Stop words are supposedly configurable in MySQL.
The acts_as_fulltextable plugin was created because of stability issues with Sphinx and Ferret, and because Solr was considered overkill. Lucene/Solr appears to be the most well documented, reliable, scalable, and flexible open source search engine out there. My guess is that Solr is still your best bet if you are a search power user and your requirements are high. Because of its simplicity and ease of maintenance though, acts_as_fulltextable offers a pretty attractive lower end alternative.
Rails Plugin: mysql_requirement
I added the plugin mysql_requirement, which lets me check the encoding/charset settings and the version of the MySQL server when my Rails application starts up, and abort startup if the requirements aren't met.
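The version part of such a check can be sketched like this (illustrative code, not the plugin's actual implementation):

```ruby
# Compare a MySQL version string such as "5.0.45-log" against a minimum.
# The "-log" style suffix is stripped before comparing numerically.
def version_ok?(actual, required)
  Gem::Version.new(actual.split('-').first) >= Gem::Version.new(required)
end

version_ok?("5.0.45-log", "5.0")  # => true
version_ok?("4.1.22", "5.0")      # => false
```

In the plugin the actual version would come from the database connection (e.g. a SELECT VERSION() query), and a failed check would abort application startup.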
Rails Tip: Configuration Parameters
The PeepCode Code Review PDF has some nice advice about how best to deal with configuration parameters in your Rails applications. Traditionally most of us have probably just stuck global Ruby constants in our environment.rb files, but there are more structured ways of doing it. I've started using the app_config plugin and it seems to work fine so far. To make sure I haven't forgotten to define a parameter in an environment I access my parameters via a custom config_param method:
# Method to supplement the app_config plugin. I want to crash early and be alerted if
# I have forgotten to define a parameter in an environment.
AppConfig.param(name) do
raise("No app_config param ' ' defined in environment , " +
"please define it in config/environments/ .rb and restart the server")
end
end
RailsConf Europe 2007: Rails and the Next Generation Web
A sponsored presentation at the conference that I think stood out, both in how professionally it was executed and in the interesting topics it touched on, was the one by Craig R. McClanahan from Sun. As I learned later, Craig has one of the most impressive track records I've seen in the Java community, being the creator of Struts and a contributor to a wide range of technologies including Tomcat and JavaServer Faces. In his talk Craig urged Rails developers to move beyond the traditional three tier architecture of web applications. In plain English this means moving away from the classic scenario of a dumb browser talking to a single Rails application connected to a single database. Three developments in the industry today that are moving us away from this architecture are:
- Moving application logic to the client with JavaScript and transfering data instead of markup
- Server side mashups - refactoring a monolithic big app into multiple small apps
- Massively scaled applications - shifting vertical scaling to horizontal scaling
Craig also urged Rails plugin developers to avoid hard-wired dependencies on ActiveRecord. Examples of plugins that succeed at this are acts_as_authenticated, make_resourceful, and paginator. Craig ended with three pieces of advice:
- Leverage duck typing to provide functionality without assuming an underlying base class
- Expose services with REST
- Think of your application as an internal mashup
The presentation slides are available online.
Ruby on Rails Deployment on FreeBSD
I did a Ruby on Rails FreeBSD deployment for a client the other day and I thought I'd share what I learned in an instructional format here. Previously I had mostly deployed on Linux (Suse, CentOS etc.) and I was curious to see what the FreeBSD experience would be like. I googled for instructions and immediately found the RailsOnFreeBSD page in the Rails Wiki. Other than that I couldn't find much Ruby on Rails and FreeBSD specific instruction out there. Note - most of the instructions in this post are not specific to FreeBSD but are generic Ruby on Rails deployment steps for Unix.
We were migrating from Windows to FreeBSD and the goal was to eliminate single points of failure. We settled on two application servers with FreeBSD 6.2 on HP hardware, both running the web and app tiers in the vanilla Rails deployment stack, i.e. Apache 2.2.4, mod_proxy_balancer, Mongrel cluster, and Mongrel. A load balancer external to the Rails system would load balance between the two Apache servers. The database we use is MySQL 5 and it sits on a separate server. The idea is to add another db server with some form of MySQL replication. We have yet to decide which replication to use and recommendations are welcome. For deployment we use Capistrano 2.
The first thing I do on a FreeBSD server is to log in with the root user and change shell from C shell to bash:
cd /usr/ports/shells/bash
make install clean
chsh -s /usr/local/bin/bash root
exit
su -
echo $SHELL
As a personal preference (or out of ignorance of vi, maybe), I install Emacs. This is a good time to go grab a cafe latte, since the installation takes forever:
cd /usr/ports/editors/emacs
make install clean
We then add the user that Capistrano will log in as and that Mongrel will run under - the deploy user:
# Make sure to choose the bash shell. You can keep the defaults
# for most of the other questions.
adduser deploy
To be able to deploy with Capistrano without repeatedly being prompted for a password, we set up public/private key authentication:
# On the production server:
ssh-keygen -t rsa

# On your development server:
ssh-keygen -t rsa
scp ~/.ssh/id_rsa.pub deploy@production-server:.ssh/remote_key
ssh deploy@production-server
cd .ssh
cat remote_key >> authorized_keys
rm remote_key
exit

# Now ssh should not prompt for a password:
ssh deploy@production-server
We edit ~/.bashrc and setup the environment for the deploy user. I think it's important to set RAILS_ENV to production. I configure the bash prompt and the history size (the number of shell commands listed by the history command) and my preferred editor. I also add some convenient aliases for accessing the log file and mysql:
export RAILS_ENV=production
export PS1="[\u@\h:\w] "
export HISTSIZE=10000
export EDITOR=emacs
export PATH=$PATH:/usr/local/mysql/bin
export APP="/var/www/apps/streamio/current"
alias cdapp='cd $APP'
alias logapp='tail -f $APP/log/production.log'
alias restartapp='cdapp; mongrel_rails cluster::restart -C config/mongrel_cluster.yml'
alias mysqlapp='mysql -h db.host.name -u db.user -pdb-password database-name'
To make sure the ~/.bashrc file is sourced, edit or create ~/.profile and add the following line to it:
source ~/.bashrc
We install sudo and give the deploy user sudo access. That way we can use sudo from Capistrano to restart the Apache web server that will be running as root:
# Install sudo
cd /usr/ports/security/sudo
make install clean
emacs /usr/local/etc/sudoers
# Uncomment wheel group
pw user mod deploy -G wheel
Make sure the clock on the server is in sync by invoking "crontab -e" and add this line:
*/30 * * * * /usr/sbin/ntpdate ntp1.sp.se
The above line syncs the clock every half hour with an internet clock - an NTP server. In this case we use ntp1.sp.se, but you may choose a different NTP server available in your country.
Now, finally, the time has come to install Ruby on Rails, which is really the heart of our server (or where our hearts are as Rails developers, at least). As indicated in the Rails wiki, we can use the rubygem-rails port for this. The port will install Ruby (the programming language), RubyGems (the package manager for Ruby software), and Ruby on Rails (the web framework). The portinstall command in the Rails wiki didn't work for me, so I used make install instead:
# Update the ports tree - takes a long time...
portsnap fetch ; portsnap extract

# Install Ruby, RubyGems, and Rails
cd /usr/ports/www/rubygem-rails
make install clean

# Check your version of the installed software.
# The versions given here are the ones I got, you may find later versions.
ruby -v
=> ruby 1.8.5 (2006-08-25) [i386-freebsd6]
gem -v
=> 0.9.2
rails -v
=> Rails 1.2.3
Now that we have RubyGems at our fingertips, we can install Capistrano, Mongrel Cluster, and Mongrel in a snap:
gem install capistrano -y
gem install mongrel_cluster -y
cap --version
=> Capistrano v2.0.0
mongrel_rails --version
=> ** Ruby version is not up-to-date; loading cgi_multipart_eof_fix
=> Mongrel Web Server 1.0.1
To be able to control the version and the configuration I chose to install Apache from source, and I followed the instructions in the excellent Mongrel book by Zed Shaw:
mkdir /usr/local/src
cd /usr/local/src
# Visit http://httpd.apache.org and download httpd-2.2.4.tar.gz
tar xzf httpd-2.2.4.tar.gz
cd httpd-2.2.4
./configure --enable-proxy --enable-proxy-balancer --enable-proxy-http --enable-rewrite \
  --enable-cache --enable-headers --enable-ssl
make
make install

# You can check the location of the httpd binary:
/usr/libexec/locate.updatedb
locate httpd | grep bin
Add the Apache startup script:
emacs /etc/rc.conf
# Add the following line:
httpd_enable="YES"

ln -s /usr/local/apache2/bin/apachectl /usr/local/etc/rc.d/httpd

# Start Apache
/usr/local/etc/rc.d/httpd start

# Fetch http://production.host.name in a browser. You should see the text "It Works".
We now configure Apache for our Mongrel server like it says in the Mongrel book:
cd /usr/local/apache2/conf
emacs httpd.conf
# Add one line:
Include /usr/local/apache2/conf/rails.conf
Create the /usr/local/apache2/conf/rails.conf file with contents like this (make sure to query replace $app_name$ with the name of your Rails app, i.e. the basename of your RAILS_ROOT):
NameVirtualHost *:80

# Setup the cluster
<Proxy balancer://$app_name$>
  BalancerMember http://127.0.0.1:8000
  BalancerMember http://127.0.0.1:8001
  BalancerMember http://127.0.0.1:8002
</Proxy>
At this point it makes sense to restart Apache to make sure that the config file parses. We now finish up on the server by installing Subversion, MySQL client, ImageMagick, and RMagick:
cd /usr/ports/devel/subversion
make install clean
cd /usr/ports/databases/mysql50-client/
make install clean
cd /usr/ports/graphics/ImageMagick
make install clean
gem install rmagick -y
If you are in a Windows environment you may want to install Samba:
cd /usr/ports/net/samba3
make install clean
You should now have the mount_smbfs command available for mounting Windows disks on your FreeBSD server.
We are now just about ready to deploy to our FreeBSD server using Capistrano 2 from our development machine. Before we do that though, let's create the directory on the FreeBSD server we'll be deploying to:
mkdir /var/www
chown deploy /var/www
Now, make sure your database.yml is properly configured. Make sure you can connect to MySQL from the FreeBSD servers. Capify your Rails app if you haven't already:
cd RAILS_ROOT
capify .
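For reference, on Capistrano 2 capify generates a Capfile (plus a config/deploy.rb stub if one doesn't exist); the Capfile looks roughly like this:

```ruby
load 'deploy' if respond_to?(:namespace) # cap2 differentiator
load 'config/deploy'
```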
You now need to edit config/deploy.rb to fit your server. In particular, make sure you have the deploy_to variable set to "/var/www/apps/#{application}" and you have the proper roles set up. Here is a sample:
role :web, rails01, rails02
role :app, rails01, rails02
role :db, rails01, :primary => true
role :scm, rails01
Also make sure to define the deploy:restart task to restart via Mongrel Cluster, and to copy or symlink in any shared files in a callback. Here is a sample from my deploy.rb file to get you started (don't copy it in its entirety; that won't work):
namespace :deploy do
# ===========================================================================
# Mongrel
# ===========================================================================
"cd && " +
"mongrel_rails cluster:: -C /config/mongrel_cluster.yml"
end
%w(restart stop start).each do |command|
task command.to_sym, :roles => :app do
run mongrel_cluster(command)
end
end
# ===========================================================================
# Apache
# ===========================================================================
desc "Restart Apache web server"
task :restart_web do
sudo "/usr/local/etc/rc.d/httpd restart"
end
# ===========================================================================
# Deployment hooks
# ===========================================================================
desc "Copy in server specific configuration files"
task :copy_shared do
proxy_dir = "#{release_path}/vendor/plugins/reverse_proxy_fix/lib"
run <<-CMD
cp #{release_path}/config/database.yml.example #{release_path}/config/database.yml &&
cp #{release_path}/config/directories.rb.example #{release_path}/config/directories.rb &&
cp #{proxy_dir}/config.rb.unix #{proxy_dir}/config.rb
CMD
end
desc "Run pre-symlink tasks"
task :before_symlink, :roles => :web do
copy_shared
backup_db
run_tests
end
desc "Run the full tests on the deployed app."
task :run_tests do
run "cd && RAILS_ENV=production rake && cat /dev/null > log/test.log"
end
desc "Clear out old code trees. Only keep 5 latest releases around"
task :after_deploy do
cleanup
sleep 5
ping_servers
end
end
If you have your deploy.rb in shape, you should now be able to run "cap deploy:setup" to set up the Capistrano directory structure on the servers, and finally run the magic command to deploy to both of the FreeBSD servers:
cap deploy
Good luck!
Rails Migration Gotcha: Forgetting to set active_record.schema_format to :sql
If you rely on any SQL not supported by Rails migrations, such as foreign keys, don't forget to set config.active_record.schema_format = :sql in your environment.rb. Otherwise your test database will be out of sync with your development database. I would really prefer that Rails used the SQL dump format for setting up the test database by default, as that would be more reliable.
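In a Rails 1.2 environment.rb the setting goes inside the initializer block:

```ruby
Rails::Initializer.run do |config|
  config.active_record.schema_format = :sql
end
```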
The reason I came across this now was actually that I found what seems to be a bug with Rails migrations. Given this create_table statement in my migration:
create_table :users do |t|
t.string :username, :null => false
t.string :role, :null => false
t.timestamps
end
Rails will create this statement in the dump file:
create_table "users", :force => true do |t|
t.string "username", :default => "", :null => false
t.string "role", :default => "", :null => false
t.datetime "created_at"
t.datetime "updated_at"
end
The difference is subtle but important to me. In the dump format Rails insists that the default value is the empty string. This is not what I want when I declare a column NOT NULL, since with MySQL 5, even with the SQL mode set to traditional, the column will then be defaulted to the empty string when someone tries to set it to NULL.
RSpec Tip: Keeping Controller Specs DRY
I am not yet converted to the idea of testing/specing views in isolation, so I usually invoke integrate_views at the top of my controller specs. I also have a bunch of helper methods that I want to reuse across my specs. To encapsulate those needs and DRY up my specs I came up with this little method that I keep in my spec/spec_helper.rb file:
# Describe a controller the way we want to do it for this app, i.e. with
# views integrated and with certain controller spec helper methods available
def describe_controller(controller, &block)
describe controller do
include ControllerSpecHelper
integrate_views
block.bind(self).call
end
end
The controller_spec_helper.rb file simply defines a ControllerSpecHelper module containing the shared helper methods.
Rails Discussion: ActiveRecord vs SQL
Here is my Rails quote of the day, from a Rails mailing list:
"When I was at the first RailsConf I was talking to someone about having used SQL for 15 years and that I was struggling with AR. At that moment someone grabbed me by the arm. I turned to look into the familiar face of Martin Fowler! He said to me, "If you know SQL that well you should just use SQL.""
The UPS Illusion of Guaranteed Delivery
My girlfriend was in a hurry to send 6 copies of her thesis (a 1 kg package) from Stockholm to Copenhagen. She sent it on Wednesday, and the UPS website said that the package would be delivered "no later than" the end of office hours on Thursday, i.e. the next day. My girlfriend chose UPS because she wanted to be sure the package would be delivered on time. Her examination is next Friday and the printed thesis needs to reach the university well before that. For the delivery my girlfriend paid about three times as much as she would have paid the postal service: about 600 SEK instead of the 200 SEK that the postal service would have charged.
Well, it's now Saturday and UPS can't really tell us why the package has not been delivered yet. According to the UPS tracking service, the package has been sitting in the destination city for two days without any delivery attempts having been made. UPS cannot explain why. It's like the package is lost in limbo. When my girlfriend calls to explain the problematic situation she is in and asks for help, she is not met by service-mindedness or understanding. She is met by accusations and unfriendliness. They say she should have known to choose an even more expensive form of delivery, at about 1000 SEK, to be guaranteed the delivery date. They also say she misinterpreted the conditions. Apparently, when UPS says "latest delivery" on a certain date, that means something different to them than it does to most people. The latest delivery date isn't really part of the contract. There is no money back and there is no apology when delivery is made at some arbitrarily later date.
Everybody knows the postal service doesn't make guarantees about the delivery date. There is mutual understanding about the contract. A lot of us probably live under the illusion, though, that UPS makes guarantees like that. Well, it's time to wake up and smell the coffee because they don't. This naturally leaves open the question of what it is that motivates the steep UPS price premium.
RailsConf Europe 2007 Notes: Dave Thomas Keynote
The Keynote by Dave Thomas was to me the most inspirational and profound talk at the conference. I have the greatest admiration for Dave Thomas. I can't offer Dave's entire talk in every detail here, nor do I know if that would be appropriate. What I have is key fragments from sentences in the talk presented sort of in the format of a poem. Hopefully someone who didn't hear the talk can make something out of it. Enjoy...
There is no such thing as software engineering
What we have been doing is taking a whole lot of dirt and filling up a hole
What makes engineering good? I look for elegance
Fred Brooks - The Mythical Man Month
Written 30 years ago. IBM mainframes. Not a single thing has changed between then and now, tragically
The programmer, like a poet, creates castles from thin air. We can create anything we can imagine
Writer's block. Write something, throw it away, start over
The blank canvas of a painter
The blank editor buffer. Frightening. Once you start, you've made a commitment
Leonardo was commissioned to produce a statue, how much money is there in it?
He started with a sketch on a scrap of paper
He was doing what we should do but aren't doing
He was prototyping, playing around, experimenting
If you step away from the keyboard and use a different medium, you are engaging a different part of the brain
The statue was canceled due to budget cuts...
With a prototype you have not invested, so you can change your mind
Brooks says be prepared to throw one away
I say be prepared to throw 10 away
You can ignore the details when you prototype
Exploratory testing. When you don't really know how something works
You write unit tests and experiment and align your understanding with what's going on
Prototypes written as tests are really useful
The end to end prototype. The tracer bullet. No error checking and no details
Leonardo sometimes drew a composition and threw it away
Scaffolding is good for prototyping
- Start anyway. Start with something and then throw it away
- Test first
- Act on worry. The lizard at the back of our minds is giving you a sense of worry. Listen to it
Just as important: know when to stop
How do you work on something that is really big if you can only see a small part of it at a time?
It's like painting a big wall or ceiling
The trick was to divide the ceiling into panels with low coupling and strong cohesion
Each panel tells only one story
Modularization is great and we don't do enough of it
Time slots. Iterations. Boxes to work in
Do what cartoonists do if you can't finish - "To be continued..."
Knowing when to stop is incredibly important
When you've set a boundary and you reach it - stop
Satisfy the customer
Paint my picture and make me look nice
Develop the software and make me look good
Increasingly less realistic portraits through history
Each portrait satisfies, but only one of them really looks like a person
There is a massive difference between a portrait and a picture
Look inside and find out what something is really like and then find a way to express that
A picture is merely the surface representation
Painters don't have the restrictions of photographers
When you are capturing requirements you have a picture
Developers are looking for underlying requirements
The USA was trying to invent a pen to write in space; meanwhile, the Russians used pencils
NASA was using pencils. The Fisher company offered a pen. NASA (the client) didn't want it
It could write upside down in a boiling toilet. NASA still didn't want it
We need to get into the habit of *not* listening to our clients
Instead figure out what they really want and work with them to establish that
Software for producing a perfect driver's license picture, bought by the state government
Pictures got interchanged and smileys were being used instead
The camera was off by one after a customer left
Another customer put a smiley on a piece of cardboard...
Is there a meaningful distinction between art and engineering? No. In reality they are the same
They share elegance, grace, understanding, digging beneath the surface
Is software engineering an art or a craft?
There is no distinction
Without engineering there is no art
Without art there is no engineering
Without art there is no soul and you won't produce anything worthwhile
We have a responsibility to demonstrate that with Rails
With Rails we have a canvas on which we can draw
Be an artist
Treat your next project as if it were a work of art
Think about the impression that you want to make
Create something great, as always
The community is full of people that make great software and applications
We are changing the world of software development
Create more than great things. Create beautiful things
Let the world know
Sign your application as the artist
That's how you show that you take pride in what you do
RailsConf Europe 2007 Notes: BDD, RSpec, and Story Runner
For me, RailsConf Europe 2007 in Berlin started out with a BDD/RSpec tutorial with David Chelimsky, Dan North, and Aslak Hellesøy. In the first part of the tutorial the presenters gave an overview of the theory and background of Behaviour Driven Development (BDD). All I can offer here is an unstructured list of keywords and notes that I was able to scribble down while listening:
- Domain Driven Design. Having engineers and business people speak the same language.
- SOA. Using contracts that say: this is all I'm going to do for you
- Focusing on outcomes and reducing features
- The Agile Alliance
- The false civil engineering analogy of building bridges
- SCRUM
- Problem: programmers deleting tests they don't understand
- Format for user stories: index cards
- The given-when-then format
- Book: "User Stories Applied"
- All features should be traceable to a persona
- Keeping personas in Facebook is a fun idea
- Example Driven Development
After the first break David Chelimsky demonstrated how RSpec is used to do Behaviour Driven Development. Suppose you are having a conversation with your customer about the application that you are going to build. It turns out you need a User class. In RSpec this is expressed as:
describe User do
end
When we run the specification we get a failure since the User class doesn't exist yet. We proceed to add the User class, re-run the specifications, and get the green light. The green light means we can go back into specification mode and describe the behaviour of our class. The idea with RSpec is to have development of your application proceed in very short cycles with the following steps:
- Add one or two lines of RSpec specification code
- Run the specification and get the red light
- Implement the simplest possible code that will fulfill the specification
- Run the specification again and get the green light
Suppose your customer says users should be in the role to which they are assigned. In RSpec we express this as:
describe User do
  it "should be in a role to which it is assigned" do
    aslak = User.new
    aslak.should be_in_role("speaker")
  end
end
Throughout the tutorial David Chelimsky used autotest to run the specs continuously in the background. David mentioned that refactoring means improving the design of a system without changing its behaviour. The hard question to answer is whether behaviour has been changed after a refactoring, and it's in the ability to answer that question that BDD/TDD can help us.
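Outside of RSpec, the behaviour being specified can be sketched in plain Ruby. This is a hypothetical implementation (the tutorial didn't show one); the only assumption is that RSpec's be_in_role matcher delegates to an in_role? predicate on the model:

```ruby
# Hypothetical plain-Ruby User that would satisfy the spec above.
# RSpec's be_in_role("speaker") matcher calls in_role?("speaker").
class User
  def initialize
    @roles = []
  end

  def assign_role(role)
    @roles << role
  end

  def in_role?(role)
    @roles.include?(role)
  end
end

aslak = User.new
aslak.assign_role("speaker")
puts aslak.in_role?("speaker")  # => true
```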
Story Runner is a new and complementary specification format that RSpec will provide in the next release. It is structured around scenarios with Given/When/Then blocks and the syntax looks a little something like this:
Story "Plan cup", %{
Narrative here
}, :type => RailsStory do
Scenario "set up a 4 team cup structure" do
Given "a new cup with max teams of", 4 do |max_teams|
end
When "I ask to see it" do
end
Then "it should a row count of", 3 do
end
Then "..." do
end
end
end
Stories live in a directory called stories next to the specs directory that RSpec already provides. The specifications deal with the units/components of the system. Stories on the other hand are higher level and are a form of acceptance/integration test that makes sure that the system works as a whole.
RailsConf Europe 2007
I sometimes tell people how the best decision I made in life was to get into salsa dancing. It was how I met my girlfriend Janne and it has simply been countless hours of fun and magic. It has also made my confidence and social skills grow, which has helped me in other areas of life as well. Visiting RailsConf Europe 2007 has to rank right up there with those great decisions, and it feels like a milestone. It gave me inspiration on many levels and I enjoyed it more than any previous conference. What made all the difference for me was the networking. Meeting so many friends and colleagues from the past and having the opportunity to talk to thought leaders in the Rails ecosystem is just amazing. Now that three intense days of excitement are over, I miss it already.
My trip to Berlin started out with meeting Sven Guckes and Emily at the Hauptbahnhof. Sven took us to places like the Bundesrat, the Brandenburger Tor, Checkpoint Charlie, and the Sony Center. Sven may very well be the most hospitable person I've met and on top of that he is a very competent Berlin guide. Many thanks go out to Sven for taking such good care of me and Emily! I spent most of the weekend (the night time) dancing salsa at the Berlin Salsa Congress. I met five other Swedes at the congress - Björn, Johan, Camilla, Lala, and Elias. I had great dances with Dace and Anja from Riga, Alex, Sarah, and Anisa from Germany, as well as girls from Ireland and Zurich. It was a small to mid-size congress (smaller than the Hamburg, Zurich, and UK congresses) with a good atmosphere and plenty of amazing dancers. There was no live band, but the music and locations were good. Especially the gala night on Saturday had a touch of flair, and there was a performance by Salsa Dance Squad from the Netherlands that was pure magic. It's fascinating how, after having seen so many different great shows and performances, all of a sudden a single show comes on that stands out like a divine work of art and makes all the others fade into the background.
When I started the conference bright and early on Monday morning at the Maritim hotel I was already quite exhausted after three straight nights of dancing. What I probably didn't realize at that point was that the intensity was not going to go down - on the contrary. On the day before the conference I had met up with my good friend Jarkko from Finland and with Dave Goodlad, Garnet, and Martin from Canada (from Vernon?) and we ended up at the Ständige Vertretung pub by the Friedrichstrasse station. Later on, Rickard and David from Newsdesk (in Stockholm) joined in. The sun was shining on us, we were by the water and in great company, and the weissbier was flowing. It was relaxing and enjoyable. For me there is always some nostalgia in drinking weissbier, as it brings back the year 2000 when I was living in Munich.
In the evening four hundred or so computer geeks met up for the Bratwurst on Rails pre-conference party. The wurst was tasty, and I met many great people such as Marcel Molina, Chad Fowler, Patrick Lenz, Lauri Jutila from Finland, David Black, Christian Dalager from Copenhagen, Paul Doerwald from Canada, Tillmann Singer from Berlin, and Timo Hentschel and Bernd Schmeil from Munich. I also met people I know from the Stockholm Ruby User Group - David and Rickard from Newsdesk, Johan Lind from Valtech, Piotr from Connecta, Markus from Stockholm University among others. The Gothenburg Ruby User Group had decent representation as well and it was great to get to know Carl-Johan Kihlbom better.
Presentations that made a particularly deep impression on me were Dave Thomas on being an artist, Jason Hoffman on scaling from the bottom up, David Heinemeier Hansson on Rails 2.0 and the never-ending quest for more beautiful code, and Marcel Molina and Michael Koziarski on coding best practices. Also interesting were David Chelimsky, Dan North, and Aslak Hellesøy on BDD and RSpec (including the new Story Runner), Craig R. McClanahan on moving beyond two tiers (avoiding a dependency on ActiveRecord), Ola Bini on JRuby (book coming soon), Britt Selvitelle (Twitter) on really scaling Rails (message: don't worry about it until you need it), and Dane Avilla on the RaiLint plugin for HTML validation and on using Watir for in-browser testing.
I didn't attend all the presentations at the conference, partly because I was exhausted and my concentration was waning, but also because I didn't want to miss the opportunity of hanging out in the exhibit area talking to passionate people in the community. The exhibitors at the conference were great. Sun was there with the NetBeans IDE and JRuby, Borland announced the 3rdRail IDE based on Eclipse, EngineYard was there to talk about hosting, and FiveRuns showed off their RM-Install and RM-Manage offerings. ThoughtWorks explained how 40% of their new enterprise projects in the US are based on Rails and how they are working on making Ruby cross the chasm and rid the IT world of bloatware (mostly .NET and J2EE as it were). I had the opportunity of mingling with people such as David Heinemeier Hansson, Martin Fowler, Roy Singham (founder of ThoughtWorks), Tom Mornini of EngineYard, and Bradley Taylor of RailsMachine.
On the last night of the conference I was invited to a special tapas dinner at a Spanish restaurant that Bradley Taylor arranged for his European clients. It was a pleasure. Before finally going home to sleep on Wednesday night, we stopped by one last time at the Maritim hotel where Marcel Molina, Michael Koziarski, Chad Fowler, Geoffrey Grosenbach, and 10-15 other Rails enthusiasts were sitting in a circle playing the werewolf role playing game. The plot is something about a village with peasants that need to figure out who the werewolves are before being killed by them. It takes figuring out when people are telling the truth, and there is a lot of strategy and drama involved. It was fascinating to watch. On Thursday morning I had breakfast at Starbucks and ran into Swami from the night before - a traveling yoga teacher from France and a RailsMachine customer. Swami is planning to open a yoga center in Slovakia. Quite fascinating.
My photos from Berlin are available on Flickr.
O'Reilly will be hosting the RailsConf Europe 2008 in Berlin too. I look forward to coming back then. Thanks everybody and auf wiedersehen!
Rails Deployment Tip: FiveRuns.com
If you are a Rails developer you have probably heard of FiveRuns with their RM-Manage service offering, providing monitoring and statistics for your Rails apps. I've wanted to try the service for quite a while and today I finally got around to doing so on a production server at one of my clients. All I can say is you have to try it to believe it - it's a super slick, user friendly, and powerful service.
There was a minor glitch in the setup. The FiveRuns monitoring client didn't find our Rails app since it was in a non-standard location. I entered the FiveRuns Campfire chat room and got live support within minutes and quickly had a resolution. Once we installed the FiveRuns Rails plugin and restarted the server we had live Rails response time and profiling statistics within minutes. The first impression I have of the service is just great. We will be evaluating the service over the next 30 days and I'll post here any new findings we make. Barring any really big issues coming up, I consider FiveRuns to be a huge value add to the Rails ecosystem, almost like a milestone, and I will be recommending it wholeheartedly to all my clients from now on.
Rails Tip: Nested Layouts
This is a feature I always wanted in Rails - nested layouts, i.e. the ability to have one master layout (application.rhtml) for your whole site, and then layouts within it that differ from section to section. Today I stumbled across this hack - just a single helper method that allows the use of nested layouts. Sweet. I'll re-post the code here:
def inside_layout(layout, &block)
  @template.instance_variable_set("@content_for_layout", capture(&block))
  layout = layout.include?("/") ? layout : "layouts/#{layout}"
  buffer = eval("_erbout", block.binding)
  buffer.concat(@template.render_file(layout, true))
end
Rails and Transactional Saves
Correction: ActiveRecord::Base#save does use transactions in Rails by default, see Jarkko's comment.
In Rails, models are not saved within a database transaction by default. This means that if you are updating some other record in a callback like after_save, the record will still be saved even if an exception is raised in the callback. If you care a lot about data integrity this is not satisfactory. One way to remedy the problem might be to simply override the save and save! methods in your model like this:

def save
  transaction { super }
end

def save!
  transaction { super }
end
That's the workaround I'm going to use for now.
Ruby as a Bash Substitute
It's really nice how Ruby can liberate me from languages that I don't like as much and am not as fluent in, such as Bash and Perl. Today I was tearing my hair out over a little Bash database backup script that I needed to write, until I realized I'm much better off writing it in Ruby. Bash translates amazingly easily to Ruby, given the syntactic similarity, the availability of backticks (`), and libraries such as FileUtils. Once I switched to Ruby the backup was solved faster and with more confidence.
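The flavour of that translation can be sketched like this, with made-up paths and names - FileUtils stands in for mkdir -p and cp:

```ruby
require 'fileutils'

# Sketch of a bash-to-Ruby backup step (hypothetical paths/naming):
# copy a file into a dated backup directory and return the new path.
def backup_file(source, backup_root)
  stamp = Time.now.strftime("%Y%m%d")
  dest_dir = File.join(backup_root, stamp)
  FileUtils.mkdir_p(dest_dir)    # like: mkdir -p "$dest_dir"
  FileUtils.cp(source, dest_dir) # like: cp "$source" "$dest_dir"
  File.join(dest_dir, File.basename(source))
end
```

For shelling out to things like mysqldump, backticks or system() slot in just as they would in Bash.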
This experience got me thinking of the expression "when all you have is a hammer, everything looks like a nail". If I'm not mistaken, the Pragmatic Programmers offer not only the advice "use the right tool for the job" but also "master your tools". There is a pretty obvious contradiction between those two pieces of advice. It's important not to be afraid of picking up new tools, but it's also important not to underestimate the cost involved in learning new tools in depth. Some tools I seem to pick up and learn quickly; others I can struggle with for years without ever really feeling confident or happy with them.
Rails Testing: Switching from Test::Unit to RSpec, What are the Advantages?
It's been a couple of months now since I switched from Rails' built-in Test::Unit framework to RSpec, and I've realized that RSpec has already brought some major advantages for me. Just to clarify: I still run all my old Test::Unit tests and I still write Test::Unit integration tests to complement my specs. The Rails testing framework and RSpec run nicely side by side through the rake command. Here is what RSpec has given me:
- It has made me more spec/test driven. Now for the first time I mostly write the specs before I write the code. I've found that if I postpone writing tests the test coverage will suffer a lot. A lot of times the tests don't get written or I end up writing just the most rudimentary tests. Tests/specs need to be written while you are in the process of writing the code. That's when you have the domain logic loaded into your head. Push it off for a few days, or even just half a day, and you will need ramp up time to load the logic into your head again. A lot of times, there is no room for such ramp up time.
- Specs are more accepted and appreciated by clients than tests. I have clients who appreciate and really understand specs (in Word documents or PDFs) but are not in the tradition of test driven development. It makes more sense to them that I spend time writing specs, especially since I can show them the nice green HTML spec doc that RSpec generates. Clients who want to control and monitor the details of what you do will like your specs even more. Specs communicate better than tests.
- RSpec is syntactically nicer than Test::Unit and thus it reads better.
Another realization that has come to me lately, is that yes, regardless of what code you are writing TDD/BDD is a good idea. However, if you are writing any kind of API or framework that is to be reused across several applications, tests/specs and well defined interfaces are almost a necessity. If you don't have them, you might be better off opting for no reuse. Reuse is the holy grail of software development, and it's hard.
Rails Gotcha: The Contract of 1.month.ago
At first glance the contract of the beautiful 1.month.ago construct in Rails may seem obvious. However, what should it return if the current time is the 31st of March? Remember that February only has 28 days. Well, according to Rails 1.2.3 the answer is the 1st of March, but according to Edge Rails the answer is (and I agree with Edge Rails here) the 28th of February.
The 1.month.ago issue is something that bit me and that you should be aware of. So with Rails 1.2.3, if you want to be sure to get the beginning of the previous month, you need to say 1.month.until(Time.now.beginning_of_month). At least that's the workaround I'm using now.
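For comparison, Ruby's standard library Date class clamps month arithmetic the same way Edge Rails does - subtracting a month from the 31st of March lands on the last day of February:

```ruby
require 'date'

# Date#<< subtracts months and clamps to the last valid day of the month.
d = Date.new(2007, 3, 31) << 1
puts d  # => 2007-02-28
```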
Rails Deployment: Rapid Setup with the Machinify Gem
I used the Machinify gem by Bradley Taylor (Mr RailsMachine) the other day to install a Rails stack on an Ubuntu 7 Xen VPS and I must say it worked really nicely. The gem uses rake to install the common stack of MySQL 5, Apache 2 with mod_proxy_balancer, Mongrel Cluster, and Mongrel. It also takes care of installing all the obscure Debian packages needed that you don't even want to know about.
There is a small glitch currently in the Machinify gem in that the Mongrel Cluster version is hardcoded in its mongrel.rake file. Once I corrected that and ran rake stack:install again it worked like a charm. Related to this issue, I was trying to find out how to fetch the current version of a Gem and the best I could come up with was:
# On the command line:
gemwhich capistrano
=> /usr/local/lib/ruby/gems/1.8/gems/capistrano-2.0.0/lib/capistrano.rb
# In Ruby:
`gemwhich capistrano`[%r{-([0-9.]+)/lib/[^/]+$}, 1]
=> "2.0.0"
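The regexp can be sanity-checked in plain Ruby against a sample gemwhich path (the path below is just an example string):

```ruby
# String#[] with a regexp and a capture group index extracts the version.
path = "/usr/local/lib/ruby/gems/1.8/gems/capistrano-2.0.0/lib/capistrano.rb"
version = path[%r{-([0-9.]+)/lib/[^/]+$}, 1]
puts version  # => 2.0.0
```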
If you are on CentOS you should check out RubyWorks by ThoughtWorks (the installer, not the singer).
When the Swedish Post is Vastly Superior to UPS/TNT/DHL et al
It's a curious thing. When it comes to politics I'm traditionally a quite liberal and market-oriented person. I usually get annoyed when I find areas of society where competition and free trade are restricted, and I'm not a great fan of state monopolies. However, when it comes to delivering packages, the Swedish state-owned postal service is offering me a service that is vastly superior to that of TNT/UPS/DHL.
Why is that? Well, essentially it's because the postal service delivers to a pick-up place in my area, usually a kiosk or store of some kind, usually with generous opening hours. This has worked like a charm for me so far. Private delivery firms (UPS et al) try to deliver to me in person. Delivering a package directly into a person's hand sounds like a great service, but it's a bit harder than it sounds. I'm not saying delivering to someone in person cannot be made to work reliably; however, the way it's currently implemented there are some serious issues. For a personal delivery to succeed, the delivery person and the recipient must be in the exact same place at the same point in time, so that the package can be handed over and signed for. Now, in this day and age of GPS systems, mobile phones, and the internet, the problem sounds solvable. In a lot of cases, at least for me, delivery fails though. For a taste of other people's experiences, see Lars Pind wishing for a doorman or this Swedish discussion saying postal service seems like a luxury. Some reasons personal delivery is failing might be:
- No appointment. The delivery company will not give you notice about when they are showing up; they just show up and knock at your door. After the first delivery attempt (for me, always a failure so far), they may or may not be able to tell you what day they will try next, although they probably can't give any guarantees, and they certainly can't give you an exact time.
- No communication to prevent failure. If the delivery person shows up and you are not there, he doesn't (from my numerous experiences with TNT/UPS/DHL) call you. He will just leave a note saying delivery failed. End of story.
- No direct communication after failure/inefficient organization. I have never been able to reach the driver directly to make an appointment or change the address. In the cases where I have talked to the driver and made agreements, they have not been kept (delivery was not made in the agreed time interval). I have then been corrected and informed that I need to talk to customer support about deliveries, never to the driver directly. In the case of the Apple Store it's even worse. You cannot talk directly to the delivery company to change an address; you have to go via the Apple Store. After the first failed delivery to my home I called Apple the same day and changed the address. However, since Apple needs a full day (!) to communicate the address change to UPS, the second delivery attempt the following day was again made to my home address instead of the work address I had changed it to.
- No convenient pick-up fallback. If delivery fails three times you have to pick up the package in some far-off and inconvenient industrial area outside your town (Stockholm in my case) with restricted opening hours, which basically requires you to make an excursion and take time out of your work day. If you cannot make it there to pick up the package, it is returned to the sender and the delivery failure is complete. Yes, this has happened to me.
RSpec Gotcha: Incompatibility with Engines 1.2.0
If you are trying to get RSpec to work in a Rails application that uses Engines you may find that RSpec errors out when you try to run it. It seems the reason is related to the file testing.rb in the engines plugin. If you comment out the line that requires that file (in vendor/plugins/engines/init.rb) you will have worked around the issue:
#require "engines/testing" if RAILS_ENV == "test"
Rails Deployment: A Little Checklist
After having fought my way through a couple of Ruby on Rails deployments, it seems there are a lot of annoying little things you need to remember. Some of them are not super critical, at least not in the short run, but they can grow into bigger problems over time. Here is the list:
- Backups and failover. I put this first because I think it is so important. What happens if a hard drive or server crashes? Can you fail over to a different server? How long will that take? Are your backups working? Have you tried doing a recovery? Backing up a MySQL database can be as easy as running mysqldump from a cron job. You can then use rsync to copy those backup files over to a different server. Remember to also back up any data files that live outside the database, such as media files that you have symlinked under the public directory.
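As an illustration, the mysqldump-plus-rsync approach can be wrapped in a small Ruby script run from cron. The database name, directories, and backup host below are made-up examples; adjust for your setup:

```ruby
#!/usr/bin/env ruby
# Nightly database backup sketch. Schedule from cron, e.g.:
#   0 4 * * * /usr/local/bin/backup_db.rb

# Build the shell commands separately so they are easy to inspect.
def backup_commands(db, backup_dir, remote)
  stamp = Time.now.strftime("%Y%m%d")
  dump  = File.join(backup_dir, "#{db}_#{stamp}.sql.gz")
  ["mysqldump #{db} | gzip > #{dump}",   # dump and compress the database
   "rsync -a #{backup_dir}/ #{remote}"]  # copy backups to another server
end

# To actually run the backup:
#   backup_commands("myapp_production", "/var/backups/mysql",
#                   "backup@backuphost:/srv/backups/myapp").each do |cmd|
#     system(cmd) or abort("backup step failed: #{cmd}")
#   end
```

Remember that a backup you have never restored from is not really a backup; test the recovery path too.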
- Documentation. Document how your server is set up and configured. Which programs are installed where, where are the config files, how do you restart the servers? This is useful since human memory is short and unreliable, and since you may need to introduce new sysadmins and developers to the project.
- Monitoring. Use a monitoring service that is entirely external to your system and that alerts you by SMS or otherwise when your server is down. You can complement this with an internal job that pings your server and restarts your mongrels if they are not responding. You should also monitor disk space, CPU, and other parameters of your server. The FiveRuns.com service looks promising when it comes to monitoring.
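The internal ping-and-restart job could be sketched like this in Ruby; the URL and the mongrel_cluster config path are assumptions for illustration:

```ruby
# watchdog.rb -- run from cron every few minutes
require 'net/http'
require 'uri'

# Returns true if the given URL answers with an HTTP 2xx response.
def healthy?(url)
  response = Net::HTTP.get_response(URI.parse(url))
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  false  # connection refused, timeouts, DNS errors etc. all count as down
end

# To actually restart the cluster when the site is down:
#   unless healthy?("http://localhost:8000/")
#     system("mongrel_rails cluster::restart -C /etc/mongrel_cluster/myapp.yml")
#   end
```

Keep the external monitoring in place regardless; a watchdog on the same box goes down with the box.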
- Setting the clock. On two recent deployments (Suse and Fedora) I've had issues with the clock being wrong. You can schedule the ntpdate command from a cron job to sync your clock.
- Clearing out old sessions. Set up a cron job to delete old records from the sessions table. I recently worked with a Rails app where this had not been done and the sessions table consequently had millions of rows.
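Such a cleanup can be a one-liner run through script/runner; the application path, schedule, and one-week cutoff below are arbitrary choices:

```ruby
# lib/session_cleanup.rb -- remove sessions not touched for over a week.
# Scheduled from cron with something like:
#   30 4 * * * cd /u/apps/myapp/current && script/runner -e production 'load "lib/session_cleanup.rb"'
CGI::Session::ActiveRecordStore::Session.delete_all(["updated_at < ?", 1.week.ago])
```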
- Clearing out old Capistrano releases. For some reason Capistrano defaults to keeping all releases forever. The fix is easy - add an after_deploy callback that invokes cleanup.
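With Capistrano 2 the callback can go in config/deploy.rb, assuming the standard deploy:cleanup task; keeping five releases is just a common choice:

```ruby
# config/deploy.rb
set :keep_releases, 5              # how many releases deploy:cleanup keeps
after "deploy", "deploy:cleanup"   # prune old releases after each deploy
```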
- Log rotation. If you don't rotate your log/production.log file it can quickly grow to hundreds of megabytes or even a few gigabytes. I don't know if that slows your server down, but such log files are not very easy to work with. You can use logrotate to do the rotation; see the Rails Wiki. Here is a sample from /etc/logrotate.conf:
# Rails logs:
/usr/local/rails_apps/platform/shared/log/*log {
    daily
    missingok
    rotate 100
    compress
    delaycompress
    notifempty
    copytruncate
    create 0666 rails www
}
Rails Deployment: Dealing with 404s
You probably use Jamis Buck's Exception Notifier plugin to be notified by email of exceptions thrown on your production server. Typically though you don't want to be notified of 404s, at least not when they are caused by an RSS reader requesting a feed that no longer exists, or some spam robot or search engine fetching URLs that are no longer supported. When the Exception Notifier plugin catches an exception that is one of ActiveRecord::RecordNotFound, ActionController::UnknownController, or ActionController::UnknownAction, it figures that it is a 404. By default no email is then sent and instead the method render_404 is invoked, which in turn renders public/404.html. This is quite appropriate. Unfortunately though, you can't trace in the log what the request that caused the 404 looked like. The other limitation is that if no route matches an incoming request, an ActionController::RoutingError is thrown, yielding a 500 and an exception email. To get consistency in how 404s are dealt with, i.e. make sure they are always fully logged and traceable but never cause email notifications, I override the render_404 method from the Exception Notifier plugin in my ApplicationController:
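A minimal sketch of such an override (the exact log format is a matter of taste) might look like:

```ruby
# app/controllers/application.rb
def render_404(exception = nil)
  # Log enough detail to trace the offending request, but send no email.
  message = "404: #{request.request_uri} ip=#{request.remote_ip}"
  message << " referer=#{request.env['HTTP_REFERER']}" if request.env['HTTP_REFERER']
  message << " (#{exception.class}: #{exception.message})" if exception
  logger.warn(message)
  render :file => "#{RAILS_ROOT}/public/404.html", :status => 404
end
```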
I then add a catch all route at the end of my routes.rb file:
map.connect '*anything', :controller => 'application', :action => 'render_404'
Rails Deployment: Looking up User ID from a Session ID
Suppose you get an exception notification and you want to know which user was browsing the site when the exception occurred. The exception email has the HTTP_COOKIE header with the _session_id. Assuming you are using ActiveRecordStore and that you store the user ID in the session, you can look it up with an incantation like the following:
CGI::Session::ActiveRecordStore::Session.find_by_session_id('c5934a07b12ae9d90cf6be4e7d48b361').data
I keep forgetting the module path to the Session class so I thought I'd post this note. The corresponding Rails file is active_record_store.rb in ActionPack.
Rails Testing: Quoting square brackets in assert_select
This is just a little detail, but today I rediscovered how to quote square brackets in argument values in HTML::Selector expressions in your assert_select commands (and the corresponding response.should have_tag commands in RSpec). I tried using backslashes before I realized that I needed to enclose the argument value in single quotes. Notice the nested square brackets ([ and ] signs) below:
it "Should have the allow_public_messages radio buttons on the edit contact info page" do
get :edit_my_options
response.should be_success
response.should have_tag("input[name='profile[allow_public_messages]']")
end