Reading and displaying Genmoji in non-rich text formatted data context

Learn how to display generated emojis within a non-rich text context.

Using Genmoji in a rich text context is straightforward: NSAdaptiveImageGlyph, the type behind Genmoji, is natively supported in rich text views, as explained in Enabling Genmoji in your app.

However, Genmoji may also appear in plain text or in any other custom format, so these situations need to be handled as well.

Since NSAdaptiveImageGlyph is an image carrying extra information for its formatting and adaptability within attributed strings, managing it in a custom format context means treating it for what it is: an inline image embedded within an attributed string.

Displaying Genmoji in custom non-RTFD documents

To represent Genmoji in a non-RTFD context, they first have to be read, which means that the attributed string must be decomposed to extract the embedded image data and the ranges where the glyphs appear.

This preserves the visual and functional representation of the image glyphs when storing, rendering, or transforming the content at the exact points of the text where they appear. Let’s see how to decompose an attributed string containing NSAdaptiveImageGlyph instances.

func decomposeAttributedString(_ attrStr: NSAttributedString) -> (String, [(NSRange, String)], [String: Data]) {
    // 1. The plain text from the attributed string
    let string = attrStr.string
    
    // 2. Where to store ranges and identifiers of the image glyphs
    var imageRanges: [(NSRange, String)] = []
    
    // 3. Dictionary to store image data using the glyphs' unique identifiers
    var imageData: [String: Data] = [:]
    
    // 4. Enumerate through all adaptiveImageGlyph attributes in the attributed string
    attrStr.enumerateAttribute(.adaptiveImageGlyph, in: NSRange(location: 0, length: attrStr.length)) { (value, range, stop) in
        
        if let glyph = value as? NSAdaptiveImageGlyph {
            // a. Get the unique identifier of the glyph
            let id = glyph.contentIdentifier
            
            // b. Store the range of the glyph and its identifier
            imageRanges.append((range, id))
            
            // c. Store the image data if it hasn't been stored already
            if imageData[id] == nil {
                imageData[id] = glyph.imageContent
            }
        }
        
    }
    
    // 5. Return plain text, ranges of the image glyphs, and image data
    return (string, imageRanges, imageData)
}

The decomposeAttributedString(_:) function takes an NSAttributedString object as its parameter and works as follows:

  1. Extract the plain text from the attributed string.
  2. Prepare to collect the ranges and identifiers of the image glyphs.
  3. Prepare a dictionary to store image data using their unique identifiers.
  4. Use the enumerateAttribute(_:in:options:using:) method to enumerate through all attributes in the attributed string and find those of type adaptiveImageGlyph. For each NSAdaptiveImageGlyph found:
    1. get its unique identifier;
    2. store its range and its identifier;
    3. store the image data if it hasn't been stored already;
  5. Return the plain text, the ranges of the image glyphs, and the image data.
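
Once decomposed, the three parts are plain values that can be persisted in whatever custom format the app uses, which is what makes this approach suitable for non-RTFD storage. As a minimal sketch of one possible format, an illustrative assumption rather than anything prescribed by the APIs, the tuple could be encoded as JSON, converting each NSRange into plain location and length values since NSRange itself is not Codable:

import Foundation

// A possible Codable wrapper for the decomposed parts (hypothetical format).
// NSRange is stored as plain location/length pairs; Data encodes as Base64.
struct StoredGenmojiText: Codable {
    struct StoredRange: Codable {
        let location: Int
        let length: Int
        let id: String
    }
    let text: String
    let ranges: [StoredRange]
    let images: [String: Data]
}

func encodeDecomposition(_ decomposed: (String, [(NSRange, String)], [String: Data])) throws -> Data {
    let (text, imageRanges, imageData) = decomposed
    return try JSONEncoder().encode(
        StoredGenmojiText(
            text: text,
            ranges: imageRanges.map { StoredGenmojiText.StoredRange(location: $0.0.location, length: $0.0.length, id: $0.1) },
            images: imageData
        )
    )
}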

After decomposing the NSAttributedString, its content has to be displayed as inline text embedding images.

// Enum to represent inline components (text or image)
enum InlineComponent {
    case text(String)
    case image(UIImage)
}

To represent and handle all the types of inline content to display, create an enumeration with two cases:

  1. String representing the text
  2. UIImage representing the Genmoji

Create a function that renders a View based on the following parameters:

  1. The InlineComponent to render
  2. A CGFloat value for the font size

@ViewBuilder
func renderComponent(component: InlineComponent, fontSize: CGFloat) -> some View {
    // 1. Handle each type of InlineComponent
    switch component {
    case .text(let text):
        // a. Render a Text view for the text component
        Text(text)
            .font(.system(size: fontSize))
    case .image(let image):
        // b. Render an Image view for the image component
        Image(uiImage: image)
            .resizable()
            .scaledToFit()
            // c. Match the height of the image to the font size
            .frame(height: fontSize)
    }
}
  1. Use a switch statement to handle each type of InlineComponent:
    1. Render a Text view when the component is of type InlineComponent.text.
    2. Render an Image view when the component is of type InlineComponent.image.
    3. Match the height of the image to the fontSize.
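
As a quick usage sketch, the function can be called directly inside a view body; here sampleGenmoji is a hypothetical placeholder asset standing in for actual Genmoji image content:

// Hypothetical usage; "sampleGenmoji" is a placeholder asset, not a real resource.
HStack(spacing: 0) {
    renderComponent(component: .text("Hello "), fontSize: 17)
    renderComponent(component: .image(UIImage(named: "sampleGenmoji") ?? UIImage()), fontSize: 17)
}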

Now we need a function that returns a SwiftUI view displaying a mix of InlineComponent values and takes the following parameters:

  1. A tuple containing the values returned by the decomposeAttributedString(_:) function:
    • String: the plain text extracted from the attributed string.
    • [(NSRange, String)]: an array of ranges and their associated image identifiers.
    • [String: Data]: a dictionary mapping image identifiers to their binary data.
  2. The font size as a CGFloat value.

func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {
    // 1. Deconstruct the input tuple
    
    // 2. Array to hold the inline components (text or images)
    
    // 3. Convert the text to NSString for proper Unicode index mapping

    // 4. Iterate through the imageRanges to match text and insert images

    // 5. Add any remaining text after the last genmoji

    // 6. Combine the inline components into a SwiftUI view
    
}

Here are the implementation details of each step.

func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {
    
    // 1. Deconstruct the input tuple
    let (text, imageRanges, imageData) = decomposedAttributedString
    
    // 2. Array to hold the inline components (text or images)
    var inlineComponents: [InlineComponent] = []
    
    // 3. Convert the text to NSString for proper Unicode index mapping
    let nsText = text as NSString

    ...

}
  1. Deconstruct the input tuple into three constants:
    1. text: storing plain string with no images;
    2. imageRanges: storing information about where images are located in the string - ranges and their identifiers;
    3. imageData: storing the actual image data.
  2. Declare an array to hold the inline components, represented by the InlineComponent enum.
  3. Convert the text to NSString for proper Unicode index mapping.
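
The NSString conversion in step 3 matters because the NSRange values produced by enumerateAttribute(_:in:options:using:) count UTF-16 code units, which is how NSString indexes text, while Swift's String counts user-perceived Characters. A quick illustration of the difference:

// NSRange offsets count UTF-16 code units, not Swift Characters.
// Slicing with Swift String indices could misplace the glyph positions.
let sample = "👋 hi"
print(sample.count)                 // 4 Characters
print((sample as NSString).length)  // 5 UTF-16 code units (👋 takes two)

With the indices handled consistently, the function can now walk through the image ranges:
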
func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {

    ...
    
    var currentIndex = 0
    
    // 4. Iterate through the imageRanges to match text and insert images
    for (range, id) in imageRanges {
    
        // a. Add text before the current image or emoji
        
        // I. Ensure the current range's location is ahead of the last processed position
        // Where the next image starts - Where we are in the string
        if range.location > currentIndex {
            
            // II. Extract the substring from the last position up to the start of the genmoji
            let textSegment = nsText.substring(with: NSRange(location: currentIndex, length: range.location - currentIndex))
            
            // III. Wrap the extracted text segment in an InlineComponent of type .text
            inlineComponents.append(.text(textSegment))
        }

        // b. Add the image if the data is available
        // I. Retrieve image data for the current identifier
        if let imageData = imageData[id], let uiImage = UIImage(data: imageData) {
            // II. Wrap the UIImage in an InlineComponent of type .image
            inlineComponents.append(.image(uiImage))
        }

        // c. Update the currentIndex to skip over the range covered by the genmoji
        currentIndex = range.location + range.length
    }
    
    ...
    
}

4. Iterate through each range in imageRanges, representing where images appear in the original string:

    1. Add any text that appears before the current image:
      1. Ensure the current range's location is ahead of the last processed position: range.location represents where the next image starts, while currentIndex is where we are in the string;
      2. textSegment extracts the substring from the last processed position up to the start of the genmoji, that is, the text segment located before the image;
      3. wrap the extracted text segment in an InlineComponent of type .text and add it to the inlineComponents array.
    2. Add the image if the image data exists and can be converted to an image:
      1. retrieve the image data for the current identifier;
      2. wrap the UIImage in an InlineComponent of type .image and add it to the inlineComponents array.
    3. Update the currentIndex to skip over the part of the string where the image is located, to avoid processing the same part of the string twice.
func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {
    
    ...
    
    // 5. Add any remaining text after the last genmoji
    // a. Ensure there's unprocessed text left
    if currentIndex < nsText.length {
        // I. Remaining text starting from the last processed position
        let remainingText = nsText.substring(from: currentIndex)
        // II. Wrap the remaining text in an InlineComponent - type text
        inlineComponents.append(.text(remainingText))
    }
    
    ...
    
}

5. Add any remaining text after the last image has been processed, using the same index-based approach:

    1. Ensure there's unprocessed text left:
      1. extract the remaining text starting from the last processed position;
      2. wrap the remaining text in an InlineComponent of type text and add it to the inlineComponents array.
func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {
    
    ...
    
    // 6. Combine the inline components into a SwiftUI view
    return HStack(alignment: .center, spacing: 0) {
    
        // a. Loop through the inline components and render each one
        ForEach(inlineComponents.indices, id: \.self) { index in
            let component = inlineComponents[index]
    
            // b. Render the view according to the component type
            renderComponent(component: component, fontSize: fontSize)
        }
        
    }
}
  1. Combine the inline components into a SwiftUI view inside an HStack:
    1. Loop through the inline components and render each one.
    2. Render the view according to the component type.
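
Keep in mind that an HStack lays out its children on a single line, so long content will not wrap. If wrapping is needed, one possible alternative, sketched below under the assumption that the same InlineComponent enum is in scope, is to merge the components into a single Text, since Text values support concatenation and can embed an Image:

// A minimal sketch: merging the components into one wrapping Text.
// (Assumes SwiftUI and UIKit are imported, as elsewhere in this article.)
func buildWrappedText(components: [InlineComponent], fontSize: CGFloat) -> Text {
    components.reduce(Text("")) { result, component in
        switch component {
        case .text(let string):
            return result + Text(string).font(.system(size: fontSize))
        case .image(let uiImage):
            // The embedded image renders at its intrinsic size, so the UIImage
            // may need to be pre-scaled to match the font size.
            return result + Text(Image(uiImage: uiImage))
        }
    }
}

The HStack-based buildInlineTextView(decomposedAttributedString:fontSize:) remains the approach used in the rest of this article.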

This is how we can integrate it into a SwiftUI view.

import SwiftUI

struct ContentView: View {

    @State var text: NSAttributedString? = NSAttributedString(string: "Start typing here...")
    @State var decomposed: (String, [(NSRange, String)], [String: Data])? = nil
		
    var fontSize: CGFloat = 16
		
    var body: some View {
        VStack(alignment: .leading) {
            // Custom Text Editor
            Section {
                CustomTextEditor(text: $text, fontSize: fontSize)
                
                Spacer()
                
                Button("Decompose") {
                    if let text = text {
                        decomposed = decomposeAttributedString(text)
                    }
                }
                .buttonStyle(.bordered)
                
                Divider()
                Spacer()
            } header: {
                Text("Custom Text Editor:")
                    .foregroundStyle(.gray)
            }
            
            Section {
                // Custom Text View to display the decomposed content
                if let text = decomposed {
                    buildInlineTextView(decomposedAttributedString: text, fontSize: fontSize)
                } else {
                    Text("No text decomposed to display")
                        .foregroundColor(.gray)
                }
                Spacer()
            } header: {
                Text("Decomposed Attributed String:")
                    .foregroundStyle(.gray)
            }
        }
        .padding()
    }
    
    // Decompose the attributed string
    private func decomposeAttributedString(_ attrStr: NSAttributedString) -> (String, [(NSRange, String)], [String: Data]) {...}
    
    // Build the text with inline images from the tuple
    private func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {...}
    
    // Enum to represent inline components (text or image)
    enum InlineComponent {...}
}


// Custom Text Editor
struct CustomTextEditor: UIViewRepresentable {
    @Binding var text: NSAttributedString?
    var fontSize: CGFloat
    
    func makeUIView(context: Context) -> UITextView {
        let textView = UITextView()
        textView.isEditable = true
        textView.allowsEditingTextAttributes = true
        textView.frame.size.height = UIScreen.main.bounds.height / 5
        textView.delegate = context.coordinator
        textView.supportsAdaptiveImageGlyph = true
        
        // Initialize with the current text if available
        if let initialText = text {
            textView.attributedText = initialText
        }
        
        textView.font = UIFont.systemFont(ofSize: self.fontSize)
        
        return textView
    }
    
    func updateUIView(_ uiView: UITextView, context: Context) {
        if uiView.attributedText != text {
            uiView.attributedText = text
            uiView.font = UIFont.systemFont(ofSize: self.fontSize)
        }
    }
    
    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }
    
    class Coordinator: NSObject, UITextViewDelegate {
        var parent: CustomTextEditor
        
        init(_ parent: CustomTextEditor) {
            self.parent = parent
        }
        
        func textViewDidChange(_ textView: UITextView) {
            // Update the binding
            if let currentText = textView.attributedText {
                parent.text = currentText
            }
            
        }
    }
}

In this SwiftUI view, a custom text editor that supports NSAdaptiveImageGlyph stores the user input in a state variable of type NSAttributedString. When the Decompose button is tapped, the current value gets decomposed and the views for its content are built and displayed.


If we need to display content coming from text with embedded inline images back in a rich text context, we can use the reverse approach and recompose the NSAttributedString from its decomposed parts.

Create a function that takes three parameters, processes them, and returns an NSAttributedString value.

func recomposeAttributedString(string: String, imageRanges: [(NSRange, String)], imageData: [String: Data]) -> NSAttributedString {
    
    // 1. Mutable attributed string with the plain text
    let attrStr: NSMutableAttributedString = .init(string: string)
    
    // 2. Dictionary to store NSAdaptiveImageGlyph objects, indexed by their unique identifiers
    var images: [String: NSAdaptiveImageGlyph] = [:]
    
    // 3. Populate the dictionary with NSAdaptiveImageGlyph objects
    for (id, data) in imageData {
        // a. Create an NSAdaptiveImageGlyph for each image and store it using its identifier
        images[id] = NSAdaptiveImageGlyph(imageContent: data)
    }
    
    // 4. Iterate over the ranges and identifiers of the image glyphs
    for (range, id) in imageRanges {
        // a. Add the adaptive image glyph as an attribute to the appropriate range in the attributed string
        if let glyph = images[id] {
            attrStr.addAttribute(.adaptiveImageGlyph, value: glyph, range: range)
        }
    }
    
    // 5. Return the fully recomposed attributed string with embedded image glyphs
    return attrStr
}
  1. Initialize a mutable attributed string with the plain text.
  2. Create a dictionary to store NSAdaptiveImageGlyph objects, indexed by their unique identifiers.
  3. Populate the dictionary with NSAdaptiveImageGlyph objects.
    1. Create an NSAdaptiveImageGlyph for each image and store it using its identifier.
  4. Iterate over the ranges and identifiers of the image glyphs.
    1. Add the adaptive image glyph as an attribute to the appropriate range in the attributed string.
  5. Return the fully recomposed attributed string with embedded image glyphs.
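
As a quick sanity check, a round trip through the two functions should preserve the plain text, as in this minimal sketch where original is assumed to be an NSAttributedString captured from a Genmoji-capable text view:

// Round-trip sketch: decompose, recompose, and verify the plain text survives.
let (plainText, ranges, data) = decomposeAttributedString(original)
let rebuilt = recomposeAttributedString(string: plainText, imageRanges: ranges, imageData: data)
assert(rebuilt.string == original.string)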

Let’s integrate it into our previous SwiftUI view.

import SwiftUI

struct ContentView: View {
	
    ...

    // Add a variable for the recomposed text
    @State var recomposed: NSAttributedString? = nil
    
    var body: some View {
    
        VStack(alignment: .leading) {
            // Custom Text Editor
            Section {
                ...
            } header: {
                Text("Custom Text Editor:")
                    .foregroundStyle(.gray)
            }
            
            Section {
                if let text = decomposed {
                    buildInlineTextView(decomposedAttributedString: text, fontSize: fontSize)

                    // New button to recompose the text
                    Button("Recompose") {
                        if decomposed != nil {
                            recomposed = recomposeAttributedString(
                                string: decomposed!.0,
                                imageRanges: decomposed!.1,
                                imageData: decomposed!.2
                            )
                        }
                    }
                    .buttonStyle(.bordered)

                    // Displaying the recomposed text
                    if recomposed != nil {
                        Divider()
                        Spacer()
                        Section {
                            CustomTextEditor(text: $recomposed, fontSize: fontSize)
                                .frame(height: 80)
                        } header: {
                            Text("Recomposed Attributed String:")
                                .foregroundStyle(.gray)
                        }
                    }
                    
                } else {
                    Text("No text decomposed to display")
                        .foregroundColor(.gray)
                }
                Spacer()
            } header: {
                Text("Decomposed Attributed String:")
                    .foregroundStyle(.gray)
            }
        }
        .padding()
    }
    
    // Add the recompose method in the view
    private func recomposeAttributedString(string: String, imageRanges: [(NSRange, String)], imageData: [String: Data]) -> NSAttributedString {...}
    
    private func decomposeAttributedString(_ attrStr: NSAttributedString) -> (String, [(NSRange, String)], [String: Data]) {...}
    
    private func buildInlineTextView(decomposedAttributedString: (String, [(NSRange, String)], [String: Data]), fontSize: CGFloat) -> some View {...}

    enum InlineComponent {...}

}

// Custom Text Editor
struct CustomTextEditor: UIViewRepresentable {...}

In this SwiftUI view, when the user types in the custom text editor and taps Decompose, the content is broken down and displayed as inline text embedding images. Tapping Recompose then rebuilds that same decomposition into an attributed string and displays it in another Genmoji-capable text editor.


By decomposing attributed strings into inline components and rebuilding them, you can seamlessly integrate images into SwiftUI views and editors supporting custom non-RTFD formats. This ensures adaptability, proper rendering, and preservation of embedded image glyphs, enabling versatile text input and display workflows in custom applications.