Wednesday, February 27, 2008

iPhone programming tips: image orientation

One of our main focuses when we started writing Sketches was to leverage as much as possible the technologies already available on the iPhone. This is not a new idea: just as all Mac OS X apps cooperate with each other, we thought that integrating Sketches with the rest of the apps in the system would make it a far more fun and pleasing app.

Given the complete lack of documentation or experience, trying to figure out how the APIs work and fit together was no easy endeavor. However, we bravely set out to study the Cocoa iPhone APIs, armed with our previous knowledge of desktop Cocoa and a lot of patience. Sometimes we thought the effort might be overkill, but being perfectionists at heart, we were uneasy delivering something that we ourselves considered suboptimal. We are satisfied with many of the results, and people seem to appreciate them too. In fact, several fellow coders have asked us about details of our image handling code, or how we manage to send emails from Sketches without actually launching MobileMail.

One of the areas we are frequently asked about is the way we handle images shot with the camera or picked from the photo album, and how we detect their orientation. Instead of answering the questions privately, we thought it would be interesting to post the answers here for others. Maybe our explanations will become outdated next week after The Event - if so, that'd be great news for all :)

In the following paragraphs I'll try to describe some of the interesting "features" I found when guessing how to use the Camera and Photo Album APIs, with a special focus on detecting orientation. I may be utterly wrong in my interpretations, but so far they have been working reasonably well for Sketches :)

* Warning: technical burble ahead *

The first idea that comes to mind if you want the camera to shoot pictures for use in your app is the cameraController:tookPicture:withPreview:jpegData:imageProperties callback invoked by the CameraController class. It looks promising, but after some experimentation we found this approach was not as flexible as we required. Instead, Sketches uses a different mechanism: we subscribe to the CameraImageFullSizeImageReadyNotification event, which is sent just after a new photo has been taken. Early in our development we decided to subscribe to all notifications and log them to a file, a simple technique that was instrumental in learning about the existence of this particular notification. In our code, the method that receives the notification is similar to the following:

- (void) cameraImageFullSizeImageReadyNotificationObserver: (NSNotification *) notification
{
    NSLog( @"cameraImageFullSizeImageReadyNotificationObserver" );
    [cc stopPreview];
    CameraImage * cameraImage = (CameraImage *) [notification object];
    if ( cameraObserver && [cameraObserver respondsToSelector: @selector(imageWasSelected:)] )
        [cameraObserver performSelector: @selector(imageWasSelected:) withObject: cameraImage];
}

The important piece of information here is that the "object" property of the notification contains an instance of the CameraImage class, which belongs to the PhotoLibrary API framework. In a moment we'll show how to deal with CameraImage instances.
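The catch-all logging trick mentioned above is plain Cocoa, by the way: passing nil as the notification name registers the observer for every notification posted. A minimal sketch (logNotification: is our own hypothetical selector, not part of any framework):

```objc
// Register for *all* notifications: a nil name (and nil object)
// means "no filtering", so every posted notification arrives here.
[[NSNotificationCenter defaultCenter] addObserver: self
                                         selector: @selector(logNotification:)
                                             name: nil
                                           object: nil];

- (void) logNotification: (NSNotification *) notification
{
    // Dump the name of each notification so we can discover
    // undocumented ones like CameraImageFullSizeImageReadyNotification.
    NSLog( @"notification seen: %@", [notification name] );
}
```

This is noisy, so it's best enabled only in debug builds while exploring.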

For photos picked with the photo album browser, we use the PLUIController class. A line like

[[PLUIController sharedInstance] setDisplayingPhotoPicker: YES];

will trigger selection of a photo from the photo album. If you set a delegate object, the photo album will invoke your delegate's imageWasSelected method, passing as a parameter the CameraImage instance that corresponds to the selected image. Note that we use the shared PLUIController instance rather than instantiating one ourselves: the accessor for the shared object knows how the instance has to be set up, while we have no idea which properties we would need to configure to make a fresh copy work.
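For completeness, here is a sketch of how the picker side fits together. This is a private, undocumented API, so the setDelegate: setter, the exact shape of imageWasSelected:, and the handleCameraImage: helper are all our assumptions based on experimentation:

```objc
// Assumption: PLUIController exposes a delegate setter; we pass an
// object that implements the imageWasSelected: method described above.
PLUIController * picker = [PLUIController sharedInstance];
[picker setDelegate: self];
[picker setDisplayingPhotoPicker: YES];

// Called back by the photo album with the selected image.
- (void) imageWasSelected: (CameraImage *) image
{
    // Hand the CameraImage to the same code path we use for camera
    // shots (handleCameraImage: is a hypothetical helper of ours).
    [self handleCameraImage: image];
}
```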

Therefore, for both types of photo selection (shooting with the camera or selecting a photo from the photo album), we end up with a CameraImage instance to deal with. A CGImage can be created from the CameraImage, using the following API call:

int orientation;
struct CGImage * cgimage = [image createFullScreenCGImageRef: &orientation];

Now we have a Core Graphics image, which is a good thing because there's actually a whole lot of documentation about that.

However, this is where things get a bit hairy regarding how to determine image orientation.

First of all, there are three different sets of values that refer to orientation characteristics:

- If you copy photos from your iPhone to your computer and look at them in an EXIF browser, you'll see that they contain one of the following values: "1" for "normal" images; "3" for images that are rotated 180 degrees; "6" for images that are rotated 90 degrees clockwise; and "8" for images that are rotated 90 degrees counter-clockwise.
- The orientation value returned in the integer above does not correspond to the EXIF values just mentioned. The values returned by createFullScreenCGImageRef are, respectively: "1", "2", "3", "4". To make things amusing, images shot with your phone in the vertical upright position will typically be stored to disk with a 90 degrees CCW rotation; therefore, the orientation value you'll receive if you pick such a photo from the photo album will be "4".
- Third, the device orientation, as reported by UIHardware, uses a different set of values. They are the following: "1" (phone vertical, upright position); "2" (phone upside down); "3" (phone rotated 90 degrees CCW with respect to the vertical position); "4" (phone rotated 90 degrees CW).

Therefore, after you select a photo from the photo library, you should check the orientation value you get when creating the CGImage reference, and then you have to correct the rotation using the rotation and translation transformations provided by the CoreGraphics API.
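As a sketch of that correction for the common "4" case (image stored rotated 90 degrees CCW), the CoreGraphics calls look roughly like this. Variable names are ours and error handling is omitted; note that the sign of the rotation angle depends on which direction the stored pixels are rotated:

```objc
// Redraw a CGImage stored with a 90-degree CCW rotation into an
// upright bitmap. Width and height swap because of the rotation.
size_t w = CGImageGetWidth( cgimage );
size_t h = CGImageGetHeight( cgimage );
CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
CGContextRef ctx = CGBitmapContextCreate( NULL, h, w, 8, h * 4,
                                          space, kCGImageAlphaPremultipliedFirst );
// Shift the origin, then rotate the coordinate system 90 degrees CW,
// so the drawn image lands inside the h x w context.
CGContextTranslateCTM( ctx, 0, w );
CGContextRotateCTM( ctx, -M_PI / 2 );
CGContextDrawImage( ctx, CGRectMake( 0, 0, w, h ), cgimage );
CGImageRef upright = CGBitmapContextCreateImage( ctx );
CGContextRelease( ctx );
CGColorSpaceRelease( space );
```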

For photos shot with the camera, however, the orientation value returned by createFullScreenCGImageRef at that particular callback is always "4", no matter what the device orientation is or how the photo is stored on disk. This probably happens because my initialization of the camera-related APIs is incomplete, but I haven't figured out how to achieve a better result. For camera shots, therefore, we use the device orientation value instead, and rotate the photo according to the description above.

Another option I tried was to open the physical files that refer to the CameraImage and read their properties. I was hoping to get the EXIF orientation value, but it seems that at this stage not many properties are available. In fact, code like the following yields a dictionary with just a "FileSize" property:

NSString * imgPath = [[image fileGroup] pathForFullSizeImage];
CGImageSourceRef imageSource = CGImageSourceCreateWithURL( (CFURLRef) [NSURL fileURLWithPath: imgPath], NULL );
NSDictionary * imageProps = (NSDictionary *) CGImageSourceCopyProperties( imageSource, NULL );
// imageSource comes from a Create function, so release it when done
CFRelease( imageSource );

We are using these techniques in the current version of Sketches, and they will also be the basis for some new features we are still cooking up. We hope they are useful to other iPhone programmers too!


pradeepta dash said...

Very useful. Apple should add some API to attach EXIF data to the raw JPEG NSData delivered by the tookPicture callback.

Anonymous said...

Thanks very much for this important bit of information. I'm just confused about a few things so I have some questions:

1. When you write "images shot with your phone in the vertical upright position will typically be stored to disk with a 90 degrees CCW rotation; therefore, the orientation value you'll receive if you pick such a photo from the photo album will be '4'.", do you mean that the data is actually rotated by 90 degrees CCW before being written, or just that the flag is set to 4? If it's the latter, then isn't 4 the appropriate flag and therefore what you'd expect? If it's the former, then shouldn't the flag be set to 1?

2. Is there a lossless way to rotate the jpeg data so the orientation can be corrected before writing to disk?

3. Is there a way to modify the exif orientation data?

4. Slightly off topic: I haven't yet tried the notification so I'm not sure if the same thing happens there, but when I use the tookPicture callback, the code executes asynchronously wrt the 1/2 to 1 second that the preview freezes. So if I popup an alert, say to confirm the picture taken, it pops up before the freeze finishes and the resulting image can be displayed. Is there something like a notification for completion of the freeze? Or someway to do what's done in the native camera app where the shutter closes and opens at the right time? Maybe you have a suggestion.

Thanks again for the important info and for reading my questions.


Pedro said...

Hi Marc,

Sorry for not having answered before, I missed the mail notification for some reason. Anyway, it's good to know that these pre-SDK conclusions are still of interest to some people :) I'll try to share as much as I know regarding your questions.

1. Images are actually stored in the rotated state. I have to reverse the rotation if I need to present the photos in their natural position. Interestingly, if you set up an application that shoots a picture using the Camera APIs described here, you'll find that the image will be shown rotated in the photo album. This means that the correct orientation value was not saved alongside the image data. As mentioned in the post, this is probably caused by an incorrect initialization of data structures when using the API.

2. I use Core Graphics to manipulate the images and reverse the rotations whenever I need to. The basics can be found here:
I am not aware of any other lossless and/or faster mechanism to achieve the same results.

3. When saving the JPEG data to disk, you can specify a dictionary of properties you want to associate to the image. The orientation data can be specified using the kCGImagePropertyOrientation dictionary key. Take a look at the documentation for CGImageDestinationAddImage for more info. However, I don't know if it is easy to modify the existing EXIF data of an image already saved.

4. There are animations to open and close the iris in the CameraView class, which take the appropriate amount of time to render. However, I'm not using them at the moment and I'm not sure how to set them up, so you may want to experiment a bit :)
There are several notifications related to the shooting process. Initially I was subscribing to "PictureWasTakenNotification", but then I found that "CameraImageFullSizeImageReadyNotification" was more appropriate in my case. You might want to log all notifications your app receives and see if there's another one that fits you best.
Also, there are several delegation and callback mechanisms - it might be a good idea to override "respondsToSelector" in your delegate objects to learn about the methods that are expected by the camera APIs.
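Regarding point 3 above, here is a sketch of how the save might look with the public ImageIO calls. The path variable and the orientation value "6" are just placeholders, and cgimage is assumed to be an existing CGImageRef:

```objc
// Write a CGImage to disk as JPEG, attaching an EXIF orientation tag
// via the properties dictionary passed to CGImageDestinationAddImage.
CGImageDestinationRef dest = CGImageDestinationCreateWithURL(
    (CFURLRef) [NSURL fileURLWithPath: path], kUTTypeJPEG, 1, NULL );
NSDictionary * props = [NSDictionary
    dictionaryWithObject: [NSNumber numberWithInt: 6]   // EXIF "rotated 90"
                  forKey: (NSString *) kCGImagePropertyOrientation];
CGImageDestinationAddImage( dest, cgimage, (CFDictionaryRef) props );
CGImageDestinationFinalize( dest );
CFRelease( dest );
```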

I hope that helps. Good luck with your project, and thanks for writing!

ap said...


First, thanks for the post, lots of great information. One question though, maybe I missed something important but you said that you were able to subscribe to and log all notifications. How do you do this?

This sounds like it would be extremely useful for what I am trying to do.


ap said...
This comment has been removed by the author.
stevie w said...

There is an open source objective-c static library for manipulating EXIF directly on the iphone (even saved images) - you can find it here (bit of self promotion really as I wrote it - but then it might come in useful for the stuff you are talking about here :-)


Living Like Crazy said...

Hi -- how did you get CGImageSourceCreateWithURL to work in an iPhone app???? I get a linker error!!!!
