
Core Image was one of the many interesting topics discussed at the iOS Tech Talk Tour, which took place in São Paulo on January 9th. The framework was already available on Mac OS X and can now also be used by iOS developers.
Note that the framework is only available from iOS 5.0 onward, so whether you can use it depends on your application's requirements. However, according to CNET, 40% of devices were already running iOS 5 in November 2011, which suggests that apps targeting this version will soon be available to the majority of users.
Face detection (often loosely called facial recognition) is by far the most interesting of Core Image’s features, and it is the one detailed in this article. It lets developers build new kinds of apps around this concept at a very low implementation cost.
We will show you how to detect faces straight from the device’s camera data stream. The source code is based on Apple’s SquareCam example project.
The first step is to configure the camera with the AVFoundation framework, available since iOS 4, so that we can read the device's video stream directly.
This configuration uses the following objects:
- AVCaptureSession – This object represents a session that coordinates the data flow from AV input devices to the outputs. We add the input and output devices to the session and start the data flow by sending it the startRunning message (and stop it with stopRunning).
- AVCaptureDevice – An abstraction of a physical device that provides input to an AVCaptureSession object. There is one object for every input device; for instance, there is one video input on the iPhone 3G, but there are two on the iPhone 4.
- AVCaptureDeviceInput – An AVCaptureInput subclass used to add an input device to a session (AVCaptureSession).
- AVCaptureOutput – An abstract class that represents an output of a session (AVCaptureSession). We use its concrete subclass AVCaptureVideoDataOutput to receive the video frames.

    - (void)setupAVCapture
    {
        AVCaptureSession *session = [AVCaptureSession new];
        if ([[UIDevice currentDevice] userInterfaceIdiom] == UIUserInterfaceIdiomPhone)
            [session setSessionPreset:AVCaptureSessionPreset640x480];
        else
            [session setSessionPreset:AVCaptureSessionPresetPhoto];

        // Select a video device, make an input
        AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
        AVCaptureDeviceInput *deviceInput = [AVCaptureDeviceInput deviceInputWithDevice:device error:nil];
        if ( [session canAddInput:deviceInput] )
            [session addInput:deviceInput];

        // Make a video data output
        videoDataOutput = [[AVCaptureVideoDataOutput alloc] init];

        // we want BGRA, both CoreGraphics and OpenGL work well with 'BGRA'
        NSDictionary *rgbOutputSettings = [NSDictionary dictionaryWithObject:
                                           [NSNumber numberWithInt:kCMPixelFormat_32BGRA]
                                                                     forKey:(id)kCVPixelBufferPixelFormatTypeKey];
        [videoDataOutput setVideoSettings:rgbOutputSettings];

        // discard if the data output queue is blocked (as we process the still image)
        [videoDataOutput setAlwaysDiscardsLateVideoFrames:YES];

        videoDataOutputQueue = dispatch_queue_create("VideoDataOutputQueue", NULL);
        [videoDataOutput setSampleBufferDelegate:self queue:videoDataOutputQueue];
        if ( [session canAddOutput:videoDataOutput] )
            [session addOutput:videoDataOutput];

        previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:session];
        [previewLayer setBackgroundColor:[[UIColor blackColor] CGColor]];
        [previewLayer setVideoGravity:AVLayerVideoGravityResizeAspect];
        CALayer *rootLayer = [previewView layer];
        [rootLayer setMasksToBounds:YES];
        [previewLayer setFrame:[rootLayer bounds]];
        [rootLayer addSublayer:previewLayer];

        [session startRunning];
    }
Identifying a face with a CIDetector
According to the CIDetector Class Reference, a CIDetector object (available since iOS 5 in CoreImage.framework) uses image processing to find “features” inside an image.
So, the next step towards identifying a face in our video data stream is to configure a CIDetector. We can create an instance like this:
    NSDictionary *detectorOptions = [[NSDictionary alloc] initWithObjectsAndKeys:
                                     CIDetectorAccuracyLow, CIDetectorAccuracy, nil];
    faceDetector = [[CIDetector detectorOfType:CIDetectorTypeFace
                                       context:nil
                                       options:detectorOptions] retain];
When we initialized the camera earlier, we configured our controller to act as the delegate of the video output stream (videoDataOutput) with this line:
[videoDataOutput setSampleBufferDelegate:self queue:videoDataOutputQueue];
We can now implement the following method to read the video data stream:
    - (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection
    {
        …
    }
Finally, with a CIDetector instance and the video data stream, we can detect faces in each frame.
    // got an image
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CFDictionaryRef attachments = CMCopyDictionaryOfAttachments(kCFAllocatorDefault,
                                                                sampleBuffer,
                                                                kCMAttachmentMode_ShouldPropagate);
    CIImage *ciImage = [[CIImage alloc] initWithCVPixelBuffer:pixelBuffer
                                                      options:(NSDictionary *)attachments];
    if (attachments)
        CFRelease(attachments);

    NSDictionary *imageOptions = nil;
    // EXIF orientation 6: the device is held upright (portrait)
    imageOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:6]
                                               forKey:CIDetectorImageOrientation];
    NSArray *features = [faceDetector featuresInImage:ciImage options:imageOptions];
    [ciImage release];
Each element of the features array is a CIFaceFeature instance. Each instance represents one face found in the video frame and lets us retrieve several pieces of information about it.
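The hard-coded orientation value 6 only matches a device held in portrait. As a minimal sketch, the function below maps a device orientation to the EXIF value the detector expects. It follows the mapping SquareCam appears to use and is written as plain C, so it can be dropped into an Objective-C file as-is. The enum and function names are hypothetical stand-ins for illustration, not UIKit or Core Image API.

```c
/* Hypothetical stand-ins for UIDeviceOrientation, so that the sketch
   compiles as plain C without UIKit. */
typedef enum {
    DEVICE_PORTRAIT,              /* home button at the bottom */
    DEVICE_PORTRAIT_UPSIDE_DOWN,  /* home button at the top */
    DEVICE_LANDSCAPE_LEFT,        /* home button on the right */
    DEVICE_LANDSCAPE_RIGHT        /* home button on the left */
} DeviceOrientation;

/* Return the EXIF orientation (1, 3, 6 or 8) to pass under the
   CIDetectorImageOrientation key for the given device orientation.
   Assumption: the front camera swaps the two landscape cases. */
static int exif_orientation_for(DeviceOrientation orientation, int front_camera)
{
    switch (orientation) {
    case DEVICE_PORTRAIT_UPSIDE_DOWN:
        return 8;                       /* row 0 on the left, column 0 at the bottom */
    case DEVICE_LANDSCAPE_LEFT:
        return front_camera ? 3 : 1;    /* 1: image already upright */
    case DEVICE_LANDSCAPE_RIGHT:
        return front_camera ? 1 : 3;    /* 3: image rotated by 180 degrees */
    case DEVICE_PORTRAIT:
    default:
        return 6;                       /* row 0 on the right, column 0 at the top */
    }
}
```

In the delegate method, the result would replace the literal 6, for example [NSNumber numberWithInt:exif_orientation_for(orientation, isFrontCamera)], where orientation and isFrontCamera are values your controller would have to track itself.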
Image’s CIFaceFeature
The properties of a CIFaceFeature object describe a face found in an image. These properties are:
- hasLeftEyePosition
- hasRightEyePosition
- hasMouthPosition
- leftEyePosition
- rightEyePosition
- mouthPosition
In addition, because CIFaceFeature inherits from CIFeature, it also has the following properties:
- bounds – The rectangle, in image coordinates, that contains the feature
- type – The feature type
With this information we can do many things, such as drawing new visual elements on top of a face found in the image.
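One detail matters before drawing anything on top of a face: Core Image reports bounds with the origin at the bottom-left corner of the image, while UIKit views place the origin at the top-left. The sketch below shows the vertical flip for a still image, again in plain C so that it compiles inside an Objective-C file. FaceRect is a hypothetical stand-in for CGRect, and a live camera preview would additionally need the scaling and rotation that SquareCam applies.

```c
/* Hypothetical stand-in for CGRect, so that the sketch compiles
   without CoreGraphics. */
typedef struct {
    double x;
    double y;
    double width;
    double height;
} FaceRect;

/* Convert a rectangle from a bottom-left origin (Core Image) to a
   top-left origin (UIKit) for an image of the given height.
   Only the y coordinate changes; size and x stay the same. */
static FaceRect face_rect_to_top_left(FaceRect rect, double image_height)
{
    FaceRect flipped = rect;
    flipped.y = image_height - (rect.y + rect.height);
    return flipped;
}
```

For example, a 200x200 face whose bounds start at (100, 50) in a 480-point-high image ends up at (100, 230) after the flip, because 480 - (50 + 200) = 230.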
CIDetectorAccuracyLow vs. CIDetectorAccuracyHigh
When we created our CIDetector, we provided the CIDetectorAccuracyLow parameter:
NSDictionary *detectorOptions = [[NSDictionary alloc] initWithObjectsAndKeys:CIDetectorAccuracyLow, CIDetectorAccuracy, nil];
We did that because we are reading from a video stream: this option makes the analysis of each video frame faster, at the cost of a higher chance of missing a face.
In general, the CIDetectorAccuracyHigh option is used to analyse a single picture, resulting in a slower processing time, but with a higher face detection rate.
As you can see, iOS 5 makes face detection extremely easy to integrate, which opens the door to features that would previously have been impractical to implement in a project. That said, we still have to keep the project requirements in mind, since not all users update their devices to the latest iOS version.
This post is also available in Portuguese.