Detecting faces in images with the Vision framework

Learn how to use the Vision framework to detect faces in images and draw a rectangle over them.

Face detection has become essential in modern computer vision applications, enabling features that range from augmented-reality camera filters to secure payments. In photo applications, for example, face detection is an invaluable tool for managing personal photo libraries, and Apple’s Photos app is a great example of this technology in action.

In today’s digital world, people rely on vast photo libraries and need intuitive and easy ways to find and sort images. Face detection enhances this experience, allowing users to browse individual faces instead of sorting through photos manually.

The Vision framework provides built-in face detection capabilities: it can detect facial regions, recognize key landmarks on the face - such as the eyes, nose, and mouth - and differentiate between distinct identities with precision. All processing occurs on-device, keeping data secure and enabling fast, reliable results.

In this article, we will focus on detecting face rectangles using DetectFaceRectanglesRequest, the Vision request that identifies and outlines faces in photos.

Detecting the faces in an image

First, import the Vision framework to have access to its features.

import Vision

Create a function that receives the image to be analyzed as a parameter and returns an optional array of FaceObservation objects.

func detectFaceRectangles(image: UIImage) async throws -> [FaceObservation]? {
    // 1. Image to be used
    guard let image = CIImage(image: image) else { return nil }
    
    do {
        // 2. Set up the request
        let request = DetectFaceRectanglesRequest()
        // 3. Perform the request
        let results = try await request.perform(on: image, orientation: .downMirrored)
        // 4. Get the results
        return results
    } catch {
        print("Encountered an error when performing the request: \(error.localizedDescription)")
    }
    
    return nil
}

The function detectFaceRectangles(image:) works as follows:

  1. Declares a constant called image to store the image to be processed as a CIImage.
  2. Sets up the DetectFaceRectanglesRequest object.
  3. Uses the perform(on:orientation:) method to run the rectangle detection request on the image. Since Vision’s coordinate origin is at the lower left - unlike SwiftUI’s upper left - setting the orientation to downMirrored aligns the coordinates with SwiftUI’s system.
  4. The request returns an array of FaceObservation objects, each storing information about a detected face.
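The vertical flip that step 3 relies on can be sketched as plain math: a normalized rectangle measured from a lower-left origin maps to an upper-left origin by flipping its y component. The Box type below is an illustrative stand-in, not part of Vision’s API:

```swift
// Sketch: converting a normalized rectangle from a lower-left origin
// (Vision's convention) to an upper-left origin (SwiftUI's convention).
// Box is an illustrative stand-in, not a Vision type.
struct Box {
    var x, y, width, height: Double
}

func flippedToUpperLeftOrigin(_ box: Box) -> Box {
    // Only y changes: the distance from the top of the unit square to the
    // box's top edge replaces the distance from the bottom to its bottom edge.
    Box(x: box.x, y: 1 - box.y - box.height, width: box.width, height: box.height)
}

let lowerLeft = Box(x: 0.2, y: 0.1, width: 0.3, height: 0.4)
let upperLeft = flippedToUpperLeftOrigin(lowerLeft)
// upperLeft.y is 1 - 0.1 - 0.4 = 0.5
```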

Among the FaceObservation properties, boundingBox stores the normalized region of the detected face; its cgRect property exposes it as the CGRect needed to draw the rectangle around the face.

Now declare a function that takes an array of FaceObservation objects as a parameter and returns a collection of CGRect.

func getRectangles(faceObservations: [FaceObservation]) -> [CGRect]? {
    
    // 1. An array where to store the CGRect
    var rectangles: [CGRect] = []
    
    for faceObservation in faceObservations {
        // 2. Filter the observations by their confidence level
        if faceObservation.confidence > 0.7 {
            // 3. Store the bounding box of the observation
            rectangles.append(faceObservation.boundingBox.cgRect)
        }
    }
    
    return rectangles
}
  1. Creates an array to store the CGRect of each result.
  2. Filters the observations by their confidence level. Each FaceObservation carries a confidence value for the accuracy of the observation, a float normalized from 0 (least confident) to 1 (most confident), meaning that not every result is guaranteed to contain a face.
  3. Stores the rectangle representing each observation’s boundingBox in the rectangles array.
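The same filter-and-collect logic can also be expressed with filter and map. The sketch below uses a minimal stand-in type mirroring the two FaceObservation properties used here, so the logic stands alone without the Vision framework:

```swift
// Minimal stand-in mirroring the two FaceObservation properties used
// above (confidence and a normalized bounding box); illustrative only.
struct Observation {
    var confidence: Float
    var boundingBox: (x: Double, y: Double, width: Double, height: Double)
}

// Keep only confident detections and collect their bounding boxes.
func confidentBoxes(_ observations: [Observation]) -> [(x: Double, y: Double, width: Double, height: Double)] {
    observations
        .filter { $0.confidence > 0.7 }
        .map(\.boundingBox)
}

let sample = [
    Observation(confidence: 0.95, boundingBox: (0.1, 0.2, 0.3, 0.3)),
    Observation(confidence: 0.40, boundingBox: (0.5, 0.5, 0.2, 0.2)),
]
// confidentBoxes(sample) keeps only the first observation
```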

Drawing the rectangles on a View

To draw the rectangle over each detected face, define a custom Shape type that converts a CGRect into a Path. If there is a rectangle to draw, the struct uses path(in:) to create a rectangular path and scales it to match the display area, ensuring the overlay is positioned accurately on the image.

struct FaceRectangleShape: Shape {
    var faceRectangle: CGRect?
    
    func path(in rect: CGRect) -> Path {
        var path = Path()
        
        if let faceRectangle = faceRectangle {
            path.addRect(faceRectangle, transform: CGAffineTransform(scaleX: rect.width, y: rect.height))
        }
        
        return path
    }
    
}
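Numerically, the scale transform in path(in:) just multiplies the normalized rectangle by the display size. Here is a standalone sketch of that mapping with plain Doubles (illustrative names, assuming the input rectangle is normalized to the unit square):

```swift
// Sketch: scaling a normalized (0...1) rectangle to a concrete view size,
// the same mapping CGAffineTransform(scaleX:y:) performs in path(in:).
func scaleToView(x: Double, y: Double, width: Double, height: Double,
                 viewWidth: Double, viewHeight: Double)
    -> (x: Double, y: Double, width: Double, height: Double) {
    (x * viewWidth, y * viewHeight, width * viewWidth, height * viewHeight)
}

// A face occupying the center quarter of a 400x300 display area.
let scaled = scaleToView(x: 0.25, y: 0.25, width: 0.5, height: 0.5,
                         viewWidth: 400, viewHeight: 300)
// scaled is (100.0, 75.0, 200.0, 150.0)
```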

Here is an example of how to use all the code discussed until here in a SwiftUI view:

import SwiftUI
import Vision

struct ContentView: View {
    
    @State private var rectangles: [CGRect]? = nil
    
    var body: some View {
        VStack {
            Image("picture")
                .resizable()
                .scaledToFit()
                .overlay(
                    // CGRect is not Hashable, so identify each rectangle by its index
                    ForEach(Array((rectangles ?? []).enumerated()), id: \.offset) { item in
                        FaceRectangleShape(faceRectangle: item.element)
                            .stroke(Color.blue, lineWidth: 2)
                    }
                )
            
            Button("Detect Face Rectangles") {
                drawRectangles()
            }
        }
        .padding()
    }
    
    private func drawRectangles() {
        Task {
            do {
                guard let faceObservations = try await detectFaceRectangles(image: UIImage(named: "picture")!) else { return }
                rectangles = getRectangles(faceObservations: faceObservations)
            } catch {
                print("Error detecting faces: \(error)")
            }
        }
    }
    
    private func getRectangles(faceObservations: [FaceObservation]) -> [CGRect]? {
        var rectangles: [CGRect] = []
        
        for faceObservation in faceObservations {
            if faceObservation.confidence > 0.7 {
                rectangles.append(faceObservation.boundingBox.cgRect)
            }
        }
        
        return rectangles
    }
        
    private func detectFaceRectangles(image: UIImage) async throws -> [FaceObservation]? {
        guard let image = CIImage(image: image) else { return nil }
        
        do {
            // Set up the request
            let request = DetectFaceRectanglesRequest()
            // Perform the request
            let results = try await request.perform(on: image, orientation: .downMirrored)
            return results
        } catch {
            print("Encountered an error when performing the request: \(error.localizedDescription)")
        }
        
        return nil
    }
}
  • Create a state property called rectangles to store the bounding boxes of the faces detected in the image.
  • Create a button labeled "Detect Face Rectangles" that triggers face detection by calling the drawRectangles() method, which in turn populates rectangles with the detected faces.
  • Use the FaceRectangleShape type to overlay rectangles on the image, drawing face boundaries whenever the rectangles state property is updated.

The combination of Vision’s robust detection capabilities with SwiftUI’s interface-building tools enables not only precise face localization but also custom overlays, adding significant value to photo and media-based applications across Apple’s ecosystem.