Ruby Best Practices

Page 201

For more complex needs, or for when you need to write a binary file, Ruby 1.9 has also changed the meaning of "rb" and "wb" in File.open(). Rather than simply disabling line-ending conversion, using these file modes will now set the external encoding to ASCII-8BIT by default. You can see this being used in Prawn’s Document#render_file method, which simply takes the raw PDF string written by Document#render and outputs a binary file: class Prawn::Document # ... def render_file(filename) File.open(filename,"wb") { |f| f << render } end end

This approach may already seem familiar to those who have needed to deploy code that runs on Windows machines. However, this section is meant to remind folks who may not have previously needed to worry about this that they need to be careful as well. There really isn’t a whole lot to worry about when working with files using this m17n strategy, but the few things that you do need to do are important to remember. Unless you’re working with binaries, be sure to explicitly specify the external encoding of your files, and transcode them to UTF-8 upon read or write. If you are working with binaries, be sure to use File.binread() or File.open() with the proper flags to make sure that your text is not accidentally encoded into the character set specified by your locale. This one can produce subtle bugs that you might not encounter until you run your code on another machine, so it’s important to try to avoid in the first place. Now that we’ve talked about source encodings and files, the main thing that’s left to discuss is how to transcode input from users in a fairly organized way, and produce output in the formats you need. Though this is not complicated, it’s worth looking at some real code to see how this is handled.

Transcoding User Input in an Organized Fashion When dealing with user input, you need to make a decision regarding what should be transcoded and where. In the case of Prawn, we take a very minimalist approach to this. While we expect the end user to provide us with UTF-8-encoded strings, many of the features that involve simple comparisons work without the need for transcoding. As it turns out, Ruby does a lot of special casing when it comes to strings containing only ASCII characters. Although strings that have different character encodings from one another are generally not comparable and cannot be combined through manipulations, these rules do not apply if each of them consists of only ASCII characters. It turns out that in practice, you don’t really need to worry about transcoding whenever you are comparing user input to a finite set of possible ASCII values. Portable m17n Through UTF-8 Transcoding | 185


Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.