The gradient that changed after publishing: reading color from 16-bit screenshots on iOS

Not so long ago it turned out that for some of our dear users the gradient drawn behind a picture in a story looked one way before they published the story, and a different way after. Clearly something was wrong somewhere — but where?

As always, the biggest problem is that on the developers’ machines everything works, the QA team can’t reproduce it, and users rarely tell you what exactly they were doing. Still, thanks to one of those users we managed to narrow it down: the problem only happened with screenshots. All that was left was to figure out what was actually wrong with them.

The simplest guess: a screenshot contains more distinct colors than an ordinary image.

A little background on how images store color

Every image is, at the end of the day, a buffer of bytes. Each color is made of 3 or 4 components — red, green, blue and sometimes alpha (transparency): RGBA. Usually each component is encoded with 8 bits, which gives us 256 levels per channel (2⁸). But 256 levels isn’t actually a very wide range, so at some point people started using 16 bits per channel instead of 8 — and that gives 65 536 levels per color, which is a lot more. And that’s exactly what screenshots on iPhones are.
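To make the two encodings concrete, here is a minimal sketch in plain Swift. The helper names `levels` and `normalized` are made up for this illustration and are not part of our codebase; the point is only that the same "full intensity" has very different raw values at 8 and 16 bits.

```swift
// Number of distinct levels one channel can take at a given bit depth.
func levels(bitsPerComponent: Int) -> Int {
    1 << bitsPerComponent
}

// Map a raw channel value to the 0...1 range that UIColor expects.
func normalized(_ raw: Int, bitsPerComponent: Int) -> Double {
    Double(raw) / Double(levels(bitsPerComponent: bitsPerComponent) - 1)
}

print(levels(bitsPerComponent: 8))    // 256
print(levels(bitsPerComponent: 16))   // 65536

// Full intensity is 255 in one encoding and 65535 in the other.
print(normalized(255, bitsPerComponent: 8))      // 1.0
print(normalized(65535, bitsPerComponent: 16))   // 1.0
```

Code that assumes the 8-bit encoding will happily read a 16-bit buffer, it will just read the wrong numbers.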

Life hadn’t prepared us for that 🙂 — our color extraction only supported the first variant (8 bits per channel).

Here’s roughly the feature we’re talking about. To build the gradient behind a picture in a story, we take a strip from the top and a strip from the bottom of the image, average the color of each strip, and use those two colors as the gradient stops:

public func getGradientColors(segmentHeightCoef: Double, _ completion: (UIColor, UIColor) -> Void) {
    // Height of each strip, at least one pixel row
    var segmentHeight = round(size.height * segmentHeightCoef)
    if segmentHeight == 0 { segmentHeight = 1 }

    // Cut a strip from the top and a strip from the bottom of the image
    guard let topStrip = cropImage(x: 0, y: 0, width: size.width, height: segmentHeight),
          let bottomStrip = cropImage(x: 0, y: size.height - segmentHeight, width: size.width, height: segmentHeight)
    else { return }

    // The average color of each strip becomes one gradient stop
    let firstColor = topStrip.getAverageColorWithPtr()
    let secondColor = bottomStrip.getAverageColorWithPtr()
    completion(firstColor, secondColor)
}

So the whole story comes down to one function: “give me the average color of these pixels.” And that’s where 16-bit screenshots quietly broke everything.

The problem

We have a buffer of bytes, and every channel value is two adjacent bytes: rr gg bb rr gg bb .... What we want is a sequence of channel values: r g b r g b .... In other words, we have an array of UInt8 and we want an array of UInt16.

The naive idea is to just tell the compiler “treat these bytes as 16-bit numbers” and be done with it. It would be wonderful, if not for one little thing: it doesn’t work.

Anyone familiar with the PNG/bitmap layout probably already knows what the catch is. Not everyone here is a specialist, though, so I’ll spell it out further down. First, the firefighting: our release was on fire and we had to ship something, so the quick fix was to fall back, in these cases, to Apple’s own machinery, which is somewhat slower but reliably correct:

public func getAverageColor(context: CIContext) -> UIColor {
    guard let ciImage = CIImage(image: self) else { return .clear }

    // The region to average: the whole image
    let extent = ciImage.extent
    let vector = CIVector(x: extent.origin.x, y: extent.origin.y,
                          z: extent.size.width, w: extent.size.height)

    guard let filter = CIFilter(name: "CIAreaAverage",
                                parameters: [kCIInputImageKey: ciImage, kCIInputExtentKey: vector]),
          let outputImage = filter.outputImage
    else { return .clear }

    // Render the 1x1 result into four bytes: R, G, B, A
    var bitmap = [UInt8](repeating: 0, count: 4)
    context.render(outputImage,
                   toBitmap: &bitmap,
                   rowBytes: 4,
                   bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                   format: .RGBA8,
                   colorSpace: CGColorSpaceCreateDeviceRGB())

    return UIColor(red: CGFloat(bitmap[0]) / 255.0,
                   green: CGFloat(bitmap[1]) / 255.0,
                   blue: CGFloat(bitmap[2]) / 255.0,
                   alpha: CGFloat(bitmap[3]) / 255.0)
}

CIAreaAverage reduces the whole region to a single pixel (typically on the GPU), and we just read it back. Correct, but it needs a CIContext and builds a Core Image filter on every call, which is overkill for “average a thin strip of pixels.”

Difficulties have never stopped me, though, so once the fire was out I came back to this in my spare time. Let’s dig in.

Endianness, or: which way do we read the bytes?

A 16-bit number is stored as two consecutive bytes. Take the value 0x1234: it can be written as [0x12, 0x34], but it can equally be written as [0x34, 0x12]. It depends on the convention you agree on, the same way text can be read left-to-right or right-to-left. These two conventions are called big-endian (most significant byte first) and little-endian (least significant byte first).
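Here is a small sketch of what that means in practice, in plain Swift with a made-up two-byte buffer. It is an illustration and not code from our app: blindly reinterpreting the bytes gives whatever the CPU's own byte order produces, while stating the byte order of the data gives a stable answer. It uses `loadUnaligned(as:)`, which needs Swift 5.7 or later.

```swift
// Two bytes as they might sit in an image buffer: one 16-bit channel value.
let bytes: [UInt8] = [0x12, 0x34]

// Naive reinterpretation: "treat these bytes as a UInt16".
// The result depends on the byte order of the CPU running the code.
let raw = bytes.withUnsafeBytes { $0.loadUnaligned(as: UInt16.self) }

// Stating the byte order of the DATA explicitly is host-independent.
let asBigEndian = UInt16(bigEndian: raw)         // 0x1234
let asLittleEndian = UInt16(littleEndian: raw)   // 0x3412

print(String(asBigEndian, radix: 16))      // 1234
print(String(asLittleEndian, radix: 16))   // 3412
```

Same two bytes, two different numbers. Which one is the real channel value depends entirely on how the image was laid out in memory.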

And here’s the kicker: you don’t actually control which one you get. Once an image is decoded into a CGImage, the byte order of the in-memory bitmap depends on the source and the platform — and it is not guaranteed to be the one you assume. On iOS the 16-bit screenshots come back little-endian, but, as it turned out, you can’t just hard-code that. Luckily Core Graphics tells you the truth via cgImage.byteOrderInfo, so the right move is to read that flag and handle both cases.

That’s the only thing left to do, really: turn each pair of bytes into one number, taking into account that the pair may be reversed (and we know the order in advance). The reconstruction is just:

// big-endian: most significant byte first
let value = data[i] << 8 + data[i + 1]

// little-endian: least significant byte first
let value = data[i + 1] << 8 + data[i]

We shift the most significant byte 8 bits to the left and add the least significant one, and we get a single 16-bit value. You can use | instead of +; here they’re equivalent, because after the shift the low 8 bits are all zero. In Swift, << binds tighter than both (unlike in C, where + binds tighter than <<), so no extra parentheses are needed.

One nuance worth a word, because it bites people: do not shift the UInt8 directly. someUInt8 << 8 is just 0 — there are no bits left in a byte to hold the result. You have to widen first:

func channel(high: UInt8, low: UInt8) -> UInt32 {
    UInt32(high) << 8 + UInt32(low)   // widen to UInt32 BEFORE shifting
}
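A quick check of both points, as a self-contained sketch with two arbitrary bytes (0xAB and 0xCD are just sample values): the unwidened shift loses everything, and the widened version gives a different number depending on which byte we treat as the high one.

```swift
// Combine two bytes into one 16-bit channel value, widening before the shift.
func channel(high: UInt8, low: UInt8) -> UInt32 {
    UInt32(high) << 8 | UInt32(low)
}

let first: UInt8 = 0xAB
let second: UInt8 = 0xCD

// Shifting the byte itself: all eight bits fall off the end of the UInt8.
let lost = first << 8                                  // 0

// Widened first, the same two bytes give different values per byte order.
let bigEndian = channel(high: first, low: second)      // 0xABCD
let littleEndian = channel(high: second, low: first)   // 0xCDAB

print(lost, String(bigEndian, radix: 16), String(littleEndian, radix: 16))
```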

Putting it together

There are two more wrinkles besides byte order.

Component layout. The channels don’t always come in RGBA order: you might get BGRA, ABGR, ARGB and so on. We read this from bitmapInfo.componentLayout, a helper on CGBitmapInfo that derives the layout from the alpha info and byte order flags (if it isn’t available in your SDK, it is a few lines to write as an extension). The layouts where alpha comes first (argb, abgr) mean we have to skip the alpha component before reading the color channels. “Skip one component” is bitsPerComponent / 8 bytes, i.e. 2 bytes for a 16-bit image and 1 byte for an 8-bit one, which is exactly our offset:

let offset: Int
switch componentLayout {
    case .abgr, .argb:
        offset = cgImage.bitsPerComponent / 8   // skip the leading alpha
    case .bgr, .rgba, .rgb, .bgra:
        offset = 0
}
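The same rule can be written down as a table of byte offsets inside one pixel. This is a sketch under assumptions: `Layout` is a hypothetical stand-in for the layout enum above, and `colorOffsets` is a helper invented for this example, not something from Core Graphics or from our code.

```swift
// Hypothetical stand-in for the component layout enum used above.
enum Layout { case rgb, bgr, rgba, bgra, argb, abgr }

// Byte offsets of the red, green and blue components inside one pixel.
func colorOffsets(for layout: Layout, bitsPerComponent: Int) -> (red: Int, green: Int, blue: Int) {
    let size = bitsPerComponent / 8   // bytes per component: 1 or 2

    // Leading alpha, if any, is skipped before the color components.
    let skip: Int
    switch layout {
    case .argb, .abgr: skip = size
    case .rgb, .bgr, .rgba, .bgra: skip = 0
    }

    switch layout {
    case .rgb, .rgba, .argb:
        return (skip, skip + size, skip + 2 * size)
    case .bgr, .bgra, .abgr:
        return (skip + 2 * size, skip + size, skip)
    }
}

print(colorOffsets(for: .argb, bitsPerComponent: 16))   // red at 2, green at 4, blue at 6
print(colorOffsets(for: .bgra, bitsPerComponent: 8))    // red at 2, green at 1, blue at 0
```

With offsets like these, the reading loop only needs one code path per byte order instead of one per layout.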

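Stride. We walk the buffer one pixel at a time. One pixel is bitsPerPixel / 8 bytes, so we step by bytesPerPixel and accumulate the channel sums. Note that this walk assumes the rows are tightly packed, i.e. bytesPerRow equals width * bytesPerPixel; if an image has padding at the end of each row, you have to iterate row by row using cgImage.bytesPerRow instead.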

guard let cgImage = cgImage,
      let cfData = cgImage.dataProvider?.data,
      let componentLayout = cgImage.bitmapInfo.componentLayout,
      let dataPtr = CFDataGetBytePtr(cfData)
else { return .clear }

if cgImage.bitsPerComponent == 16 {
    let bytesPerPixel = cgImage.bitsPerPixel / 8
    let totalPixelCount = cgImage.width * cgImage.height

    // UInt64 sums: a UInt32 sum overflows (and traps) after roughly
    // 65 thousand full-white 16-bit pixels, which is a very small strip
    var redSum: UInt64 = 0
    var greenSum: UInt64 = 0
    var blueSum: UInt64 = 0

    // combine two bytes into one 16-bit channel value
    func channel(high: UInt8, low: UInt8) -> UInt32 {
        UInt32(high) << 8 + UInt32(low)
    }

    for i in stride(from: 0, to: totalPixelCount * bytesPerPixel, by: bytesPerPixel) {
        let red: UInt32
        let green: UInt32
        let blue: UInt32

        if cgImage.byteOrderInfo == .order16Little {
            // least significant byte first: high byte is the SECOND one
            switch componentLayout {
                case .bgra, .bgr, .abgr:
                    blue  = channel(high: dataPtr[i + 1 + offset], low: dataPtr[i + offset])
                    green = channel(high: dataPtr[i + 3 + offset], low: dataPtr[i + 2 + offset])
                    red   = channel(high: dataPtr[i + 5 + offset], low: dataPtr[i + 4 + offset])
                case .rgba, .rgb, .argb:
                    red   = channel(high: dataPtr[i + 1 + offset], low: dataPtr[i + offset])
                    green = channel(high: dataPtr[i + 3 + offset], low: dataPtr[i + 2 + offset])
                    blue  = channel(high: dataPtr[i + 5 + offset], low: dataPtr[i + 4 + offset])
            }
        } else {
            // most significant byte first: high byte is the FIRST one
            switch componentLayout {
                case .bgra, .bgr, .abgr:
                    blue  = channel(high: dataPtr[i + offset],     low: dataPtr[i + 1 + offset])
                    green = channel(high: dataPtr[i + 2 + offset], low: dataPtr[i + 3 + offset])
                    red   = channel(high: dataPtr[i + 4 + offset], low: dataPtr[i + 5 + offset])
                case .rgba, .rgb, .argb:
                    red   = channel(high: dataPtr[i + offset],     low: dataPtr[i + 1 + offset])
                    green = channel(high: dataPtr[i + 2 + offset], low: dataPtr[i + 3 + offset])
                    blue  = channel(high: dataPtr[i + 4 + offset], low: dataPtr[i + 5 + offset])
            }
        }

        redSum += UInt64(red)
        greenSum += UInt64(green)
        blueSum += UInt64(blue)
    }

    let count = UInt64(totalPixelCount)
    return UIColor(red:   CGFloat(redSum   / count) / 65535.0,
                   green: CGFloat(greenSum / count) / 65535.0,
                   blue:  CGFloat(blueSum  / count) / 65535.0,
                   alpha: 1.0)
}

The only difference between the two branches is which of the two bytes is the high one — that’s the whole “read it left-to-right vs right-to-left” idea, made concrete. We divide each sum by the pixel count and normalize by 65535.0 (that’s 2¹⁶ − 1, the maximum 16-bit value) to land back in UIColor’s 0...1 range. The 8-bit path is the same loop with single bytes and a 255.0 divisor.
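For completeness, here is roughly what the 8-bit variant looks like, as a sketch in plain Swift rather than our production code. `averageRGB8` and its parameters are names chosen for this example, and it assumes an RGBA buffer with 8 bits per channel. It walks row by row using `bytesPerRow`, so padding at the end of a row is skipped, and it keeps the sums in UInt64 to leave ample headroom for large images.

```swift
// Average RGB of an 8-bit-per-channel RGBA buffer, returned in 0...1.
// `bytesPerRow` may be larger than width * 4 when rows are padded.
func averageRGB8(_ data: [UInt8], width: Int, height: Int, bytesPerRow: Int)
    -> (red: Double, green: Double, blue: Double)? {
    let bytesPerPixel = 4
    guard width > 0, height > 0,
          bytesPerRow >= width * bytesPerPixel,
          data.count >= (height - 1) * bytesPerRow + width * bytesPerPixel
    else { return nil }

    var redSum: UInt64 = 0
    var greenSum: UInt64 = 0
    var blueSum: UInt64 = 0

    for row in 0..<height {
        let rowStart = row * bytesPerRow   // jumps over any padding of the previous row
        for column in 0..<width {
            let i = rowStart + column * bytesPerPixel
            redSum += UInt64(data[i])
            greenSum += UInt64(data[i + 1])
            blueSum += UInt64(data[i + 2])
        }
    }

    // Divide in floating point: pixel count times the maximum 8-bit value.
    let scale = Double(width * height) * 255.0
    return (Double(redSum) / scale, Double(greenSum) / scale, Double(blueSum) / scale)
}

// One row of two pixels (pure red, pure blue) followed by 4 bytes of padding.
let buffer: [UInt8] = [255, 0, 0, 255,   0, 0, 255, 255,   9, 9, 9, 9]
if let average = averageRGB8(buffer, width: 2, height: 1, bytesPerRow: 12) {
    print(average)   // red and blue average to 0.5, green to 0.0
}
```

The padding bytes never enter the sums, which is the difference from a flat walk over the whole buffer.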

And that was basically the end of it — everything started working 🙂

Wrap-up

The bug looked spooky (“the color changes after publishing!”), but underneath it was the most basic thing in the world: two bytes that we were reading as if they were one. The fix is three facts you have to respect — bits per component, byte order, and component layout — none of which you can assume; you ask CGImage and branch on the answer.

Don’t be afraid to take on new or unfamiliar problems, look for clues, build new things — it brings a lot of satisfaction from what you do.

Link to the original article: https://habr.com/ru/articles/1049656/