Detecting faces in images with the Vision framework
Learn how to use the Vision framework to detect faces in images and draw a rectangle over each one.
Face detection has become essential in modern computer vision applications, enabling features from augmented reality camera filters to secure payments, and its applications extend far beyond these uses. In photo applications, for example, face detection is an invaluable tool for managing personal photo libraries, and Apple’s Photos app is a great example of this technology in action.
In today’s digital world, people rely on vast photo libraries and need intuitive and easy ways to find and sort images. Face detection enhances this experience, allowing users to browse individual faces instead of sorting through photos manually.
The Vision framework provides built-in face detection features: it locates facial regions and recognizes key landmarks on the face, such as eyes, nose, and mouth. All processing occurs on-device, keeping data secure and enabling fast, reliable results.
In this article, we will focus on detecting face rectangles using DetectFaceRectanglesRequest, the Vision request type that allows us to identify and outline faces in photos.
Detecting the faces on an image
First, import the Vision framework to have access to its features.
import Vision
Create a function that receives the image to be analyzed as a parameter and returns an array of FaceObservation objects as a result.
func detectFaceRectangles(image: UIImage) async throws -> [FaceObservation]? {
    // 1. Image to be used
    guard let image = CIImage(image: image) else { return nil }
    do {
        // 2. Set up the request
        let request = DetectFaceRectanglesRequest()
        // 3. Perform the request
        let results = try await request.perform(on: image, orientation: .downMirrored)
        // 4. Get the results
        return results
    } catch {
        print("Encountered an error when performing the request: \(error.localizedDescription)")
    }
    return nil
}
The function detectFaceRectangles(image:) works as follows:
- Declares a constant called image to store the image to be processed as a CIImage.
- Sets up the DetectFaceRectanglesRequest object.
- Uses the perform(on:orientation:) method to run the rectangle detection request on the image. Since Vision’s coordinate origin is at the lower left, unlike SwiftUI’s upper left, setting the orientation to downMirrored aligns the coordinates with SwiftUI’s system.
- Returns the result of the request, an array of FaceObservation objects that store information about the detected faces.
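The coordinate difference mentioned above can be made concrete with a small sketch. Assuming a normalized rectangle (all values between 0 and 1) measured from the lower-left corner, the hypothetical helper flippedVertically(_:) below, which is not part of Vision, returns the same rectangle measured from the upper-left corner. It only illustrates the arithmetic; the code in this article relies on the orientation parameter instead.

```swift
import Foundation

/// Hypothetical helper (not a Vision API): converts a normalized rectangle
/// with a lower-left origin into the same rectangle with an upper-left origin.
func flippedVertically(_ rect: CGRect) -> CGRect {
    CGRect(
        x: rect.minX,
        // The top edge in one system is the distance from the bottom in the other.
        y: 1 - rect.maxY,
        width: rect.width,
        height: rect.height
    )
}

// A rectangle whose bottom edge sits halfway up the image (lower-left origin).
let lowerLeftRect = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let upperLeftRect = flippedVertically(lowerLeftRect)
// upperLeftRect is (x: 0.25, y: 0.25, width: 0.5, height: 0.25)
```

Only the y value changes: the rectangle keeps its size and horizontal position.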
Among the FaceObservation properties, boundingBox is the one that stores the information needed to draw the rectangle around the detected face. Its cgRect property exposes that information as a CGRect.
Now declare a function that takes an array of FaceObservation objects as a parameter and returns a collection of CGRect values.
func getRectangles(faceObservations: [FaceObservation]) -> [CGRect]? {
    // 1. An array where to store the CGRect
    var rectangles: [CGRect] = []
    for faceObservation in faceObservations {
        // 2. Filter the observations by its own confidence level
        if faceObservation.confidence > 0.7 {
            // 3. Store the bounding box of the observation
            rectangles.append(faceObservation.boundingBox.cgRect)
        }
    }
    return rectangles
}
The function getRectangles(faceObservations:) works as follows:
- Creates an array to store the CGRect of each result.
- Filters the observations by their confidence level. Each FaceObservation object in the collection has a confidence level for the observation’s accuracy, a float value normalized from 0 to 1, where 1 is the most confident. Not every result is necessarily a face, so only observations with a confidence above 0.7 are kept.
- Stores the rectangle representing each observation’s boundingBox in the rectangles array.
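The same filtering step can also be written as a filter and map chain. The sketch below is a hypothetical variant of getRectangles(faceObservations:), not the article's function: the name rectangles(from:minimumConfidence:) and the minimumConfidence parameter are assumptions introduced here so the threshold is not hard-coded, and it returns a non-optional array since an empty array already expresses "no faces".

```swift
import CoreGraphics
import Vision

/// Hypothetical variant of getRectangles(faceObservations:).
/// Assumes a deployment target where FaceObservation is available.
func rectangles(
    from observations: [FaceObservation],
    minimumConfidence: Float = 0.7
) -> [CGRect] {
    observations
        // Keep only the observations Vision is reasonably confident about.
        .filter { $0.confidence > minimumConfidence }
        // Convert each normalized bounding box to a CGRect.
        .map { $0.boundingBox.cgRect }
}
```

Calling rectangles(from: faceObservations) keeps the 0.7 threshold used in the article, while passing a different minimumConfidence makes it easy to experiment with stricter or looser filtering.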
Drawing the rectangles on a View
To draw a rectangle over each detected face, define a custom shape type that converts a CGRect into a Path. If there is a rectangle to draw, the struct’s path(in:) method creates a rectangular path and scales it to match the display area, ensuring accurate overlay positioning on the image.
struct FaceRectangleShape: Shape {
    var faceRectangle: CGRect?
    func path(in rect: CGRect) -> Path {
        var path = Path()
        if let faceRectangle = faceRectangle {
            path.addRect(faceRectangle, transform: CGAffineTransform(scaleX: rect.width, y: rect.height))
        }
        return path
    }
}
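To see what the scale transform in path(in:) does to a normalized rectangle, here is a small worked example outside of SwiftUI. The rectangle and the 400 by 320 display size are made-up values chosen for illustration.

```swift
import CoreGraphics

// A made-up normalized bounding box: every value is a fraction of the image size.
let normalizedRect = CGRect(x: 0.25, y: 0.125, width: 0.5, height: 0.375)

// A made-up display area, standing in for the rect passed to path(in:).
let displaySize = CGSize(width: 400, height: 320)

// The same scale transform that FaceRectangleShape passes to addRect.
let scale = CGAffineTransform(scaleX: displaySize.width, y: displaySize.height)
let scaledRect = normalizedRect.applying(scale)
// scaledRect is (x: 100, y: 40, width: 200, height: 120)
```

Because the bounding box is expressed as fractions of the image, the same observation can be drawn correctly at any display size by changing only the scale.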
Here is an example of how to use all the code discussed so far in a SwiftUI view:
import SwiftUI
import Vision

struct ContentView: View {
    @State private var rectangles: [CGRect]? = nil

    var body: some View {
        VStack {
            Image("picture")
                .resizable()
                .scaledToFit()
                .overlay(
                    ForEach(rectangles ?? [], id: \.self) { item in
                        FaceRectangleShape(faceRectangle: item)
                            .stroke(Color.blue, lineWidth: 2)
                    }
                )
            Button("Detect Face Rectangles") {
                drawRectangles()
            }
        }
        .padding()
    }

    private func drawRectangles() {
        Task {
            do {
                guard let faceObservations = try await detectFaceRectangles(image: UIImage(named: "picture")!) else { return }
                rectangles = getRectangles(faceObservations: faceObservations)
            } catch {
                print("Error detecting faces: \(error)")
            }
        }
    }

    private func getRectangles(faceObservations: [FaceObservation]) -> [CGRect]? {
        var rectangles: [CGRect] = []
        for faceObservation in faceObservations {
            if faceObservation.confidence > 0.7 {
                rectangles.append(faceObservation.boundingBox.cgRect)
            }
        }
        return rectangles
    }

    private func detectFaceRectangles(image: UIImage) async throws -> [FaceObservation]? {
        guard let image = CIImage(image: image) else { return nil }
        do {
            // Set up the request
            let request = DetectFaceRectanglesRequest()
            // Perform the request
            let results = try await request.perform(on: image, orientation: .downMirrored)
            return results
        } catch {
            print("Encountered an error when performing the request: \(error.localizedDescription)")
        }
        return nil
    }
}
The view works as follows:
- Creates a state property called rectangles to store the bounding boxes of detected faces in the image.
- Creates a button labeled "Detect Face Rectangles" that triggers face detection by calling drawRectangles(), which runs detectFaceRectangles(image:) and populates rectangles with the bounding boxes of the detected faces.
- Uses the FaceRectangleShape object to overlay rectangles on the image, drawing face boundaries whenever the rectangles state property is updated.
The combination of Vision’s robust detection capabilities with SwiftUI’s interface-building tools enables not only precise face localization but also custom overlays, adding significant value to photo and media-based applications across Apple’s ecosystem.