<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Santosh Mahto]]></title><description><![CDATA[I enjoy diving deep and finding practical, high-performance solutions. Working with media streaming &amp; many more]]></description><link>https://blog.insantoshmahto.dev</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1751211442028/222b63b3-18e1-4341-a3a7-2837600eb0bd.png</url><title>Santosh Mahto</title><link>https://blog.insantoshmahto.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 01:41:16 GMT</lastBuildDate><atom:link href="https://blog.insantoshmahto.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[🎨 From Pixels to Performance: Mastering Image Matrices, Compression & GPU Acceleration]]></title><description><![CDATA[Whether you're building a graphics editor, optimizing images for the web, or preprocessing data for machine learning, understanding how images are stored, compressed, and processed is essential. 
This blog dives deep from the basics of image matrices ...]]></description><link>https://blog.insantoshmahto.dev/from-pixels-to-performance-mastering-image-matrices-compression-and-gpu-acceleration</link><guid isPermaLink="true">https://blog.insantoshmahto.dev/from-pixels-to-performance-mastering-image-matrices-compression-and-gpu-acceleration</guid><category><![CDATA[image processing]]></category><category><![CDATA[GPU]]></category><category><![CDATA[Swift]]></category><category><![CDATA[webp]]></category><category><![CDATA[PNG]]></category><category><![CDATA[jpeg]]></category><category><![CDATA[apple silicon]]></category><category><![CDATA[C++]]></category><category><![CDATA[Python]]></category><category><![CDATA[bmp]]></category><dc:creator><![CDATA[Santosh Mahto]]></dc:creator><pubDate>Sun, 29 Jun 2025 14:45:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1751207608373/f50582a8-dbe4-47d1-86e9-9acf2b305a57.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Whether you're building a graphics editor, optimizing images for the web, or preprocessing data for machine learning, understanding how images are stored, compressed, and processed is essential. This blog dives deep from the basics of image matrices to advanced GPU-powered grayscale conversion using both Apple Silicon and NVIDIA CUDA.</p>
<hr />
<h2 id="heading-key-terms-for-beginners">📚 Key Terms for Beginners</h2>
<h3 id="heading-what-is-an-image-matrix">What Is an Image Matrix?</h3>
<p>An image is essentially a matrix (array) of pixel values:</p>
<ul>
<li><p><strong>Grayscale</strong>: <code>H x W</code> (1 channel: intensity)</p>
</li>
<li><p><strong>RGB</strong>: <code>H x W x 3</code> (Red, Green, Blue)</p>
</li>
<li><p><strong>RGBA</strong>: <code>H x W x 4</code> (RGB + Alpha for transparency)</p>
</li>
</ul>
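<p>To make the three layouts concrete, here is a minimal sketch using plain nested lists. The helpers <code>make_image</code> and <code>shape</code> are hypothetical names made up for this illustration; in practice you would likely use a NumPy array instead.</p>

```python
# Hypothetical helpers for illustration only (not from any library).
def make_image(height, width, pixel):
    """Build an H x W image as nested lists, one value or tuple per pixel."""
    return [[pixel for _ in range(width)] for _ in range(height)]

def shape(image):
    """Return (H, W, channels) for a nested-list image."""
    first = image[0][0]
    channels = len(first) if isinstance(first, tuple) else 1
    return (len(image), len(image[0]), channels)

gray = make_image(2, 3, 128)                # one intensity per pixel
rgb = make_image(2, 3, (255, 0, 0))         # red, green, blue
rgba = make_image(2, 3, (255, 0, 0, 255))   # RGB plus alpha

print(shape(gray), shape(rgb), shape(rgba))
# prints (2, 3, 1) (2, 3, 3) (2, 3, 4)
```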
<h3 id="heading-what-is-compression">What Is Compression?</h3>
<p>Compression reduces file size by removing redundant or less important data.</p>
<ul>
<li><p><strong>Lossless</strong>: No data is lost. You can recover the exact original.</p>
</li>
<li><p><strong>Lossy</strong>: Some data is discarded, prioritizing visual similarity over exact reconstruction.</p>
</li>
</ul>
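<p>The difference is easy to see on a handful of bytes. The sketch below uses Python's built-in <code>zlib</code> for the lossless case and a deliberately crude quantization step for the lossy case. The pixel values are made up, and real lossy codecs are far more sophisticated than rounding to multiples of 16.</p>

```python
import zlib

# A tiny made-up "image": 64 grayscale pixel values in the range 0-255.
pixels = bytes((i * 7) % 256 for i in range(64))

# Lossless: DEFLATE via zlib. Decompressing returns the exact original bytes.
restored = zlib.decompress(zlib.compress(pixels))
print(restored == pixels)  # True

# Lossy (illustrative only): round every value down to a multiple of 16.
# The discarded low bits cannot be recovered afterwards.
quantized = bytes((p // 16) * 16 for p in pixels)
max_error = max(abs(a - b) for a, b in zip(pixels, quantized))
print(quantized == pixels, max_error)  # False 15
```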
<h3 id="heading-what-is-transparency-alpha-channel">What Is Transparency (Alpha Channel)?</h3>
<p>An alpha channel defines how transparent each pixel is. In 8-bit images, 0 is fully transparent and 255 is fully opaque.</p>
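<p>As a rough illustration of what those values mean when a pixel is drawn over a background, here is a sketch of the usual "source over" blend for a single 8-bit channel. It assumes straight (non-premultiplied) alpha and an opaque background; <code>blend_over</code> is a hypothetical helper name.</p>

```python
def blend_over(fg, bg, alpha):
    """Composite one 8-bit channel value over another using 8-bit alpha."""
    a = alpha / 255  # 0.0 = fully transparent, 1.0 = fully opaque
    return round(fg * a + bg * (1 - a))

print(blend_over(200, 50, 0))    # 50  (only the background is visible)
print(blend_over(200, 50, 255))  # 200 (only the foreground is visible)
print(blend_over(200, 50, 128))  # 125 (roughly half and half)
```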
<hr />
<h2 id="heading-understanding-image-formats">💡 Understanding Image Formats</h2>
<h3 id="heading-bmp-bitmap">BMP (Bitmap)</h3>
<ul>
<li><p>Raw pixel data, no compression</p>
</li>
<li><p>Very large file size</p>
</li>
</ul>
<h3 id="heading-png-portable-network-graphics">PNG (Portable Network Graphics)</h3>
<ul>
<li><p><strong>Lossless</strong> compression using DEFLATE (zlib)</p>
</li>
<li><p>Supports transparency</p>
</li>
</ul>
<h3 id="heading-jpeg-joint-photographic-experts-group">JPEG (<strong>Joint Photographic Experts Group</strong>)</h3>
<ul>
<li><p><strong>Lossy</strong> compression using DCT (Discrete Cosine Transform)</p>
</li>
<li><p>Ideal for photos; a poor fit for UIs or text, where sharp edges show compression artifacts</p>
</li>
</ul>
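<p>To give a feel for why a DCT-based codec is lossy, here is a simplified one-dimensional sketch. JPEG itself works on 8x8 blocks with a two-dimensional DCT and per-frequency quantization tables; the pixel row and the single <code>step</code> value below are made-up assumptions for illustration.</p>

```python
import math

N = 8  # samples per block

def dct(block):
    """Orthonormal DCT-II of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        total = sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                    for n, x in enumerate(block))
        out.append(scale * total)
    return out

def idct(coeffs):
    """Inverse of dct(): rebuild the N samples from their coefficients."""
    out = []
    for n in range(N):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            total += scale * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
        out.append(total)
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel row
step = 10                               # hypothetical quantization step

# Quantization is where information is lost: coefficients are rounded
# to multiples of `step`, so the inverse transform is only approximate.
quantized = [round(c / step) for c in dct(row)]
approx = [round(v) for v in idct([q * step for q in quantized])]
print(quantized)
print(approx)
```

<p>Without the quantization step, <code>idct(dct(row))</code> reproduces the row exactly (up to floating-point rounding), so the transform itself loses nothing.</p>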
<h3 id="heading-webp-by-google">WebP (by Google)</h3>
<ul>
<li><p>Supports both <strong>lossy and lossless</strong> compression</p>
</li>
<li><p>Supports transparency</p>
</li>
<li><p>Modern, web-optimized</p>
</li>
</ul>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Format</td><td>Lossless</td><td>Lossy</td><td>Alpha</td><td>Use Case</td></tr>
</thead>
<tbody>
<tr>
<td>BMP</td><td>✅</td><td>❌</td><td>❌</td><td>Raw data, internal use</td></tr>
<tr>
<td>PNG</td><td>✅</td><td>❌</td><td>✅</td><td>UI, icons, screenshots</td></tr>
<tr>
<td>JPG</td><td>❌</td><td>✅</td><td>❌</td><td>Photos</td></tr>
<tr>
<td>WebP (lossy)</td><td>❌</td><td>✅</td><td>✅</td><td>Photos with transparency</td></tr>
<tr>
<td>WebP (lossless)</td><td>✅</td><td>❌</td><td>✅</td><td>Replacement for PNG</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-matrix-vs-file-size-example-250x250-rgba">📊 Matrix vs File Size Example (250x250 RGBA)</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Format</td><td>Raw Matrix Size</td><td>Compressed Size (Approx)</td></tr>
</thead>
<tbody>
<tr>
<td>BMP</td><td>250 KB</td><td>~250 KB</td></tr>
<tr>
<td>PNG</td><td>250 KB</td><td>~0.5–0.8 KB</td></tr>
<tr>
<td>JPG</td><td>250 KB</td><td>~2 KB</td></tr>
<tr>
<td>WebP Lossy</td><td>250 KB</td><td>~1–2 KB</td></tr>
<tr>
<td>WebP Lossless</td><td>250 KB</td><td>~0.3–0.5 KB</td></tr>
</tbody>
</table>
</div><p><img src="https://upload.wikimedia.org/wikipedia/commons/7/73/JPEG_example_compression_ratio_comparison.png" alt="Image Compression Comparison" /></p>
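<p>The raw matrix size in the table follows directly from the dimensions, assuming one byte per channel and decimal kilobytes:</p>

```python
width, height, channels = 250, 250, 4  # RGBA, one byte per channel

raw_bytes = width * height * channels
print(raw_bytes)         # 250000
print(raw_bytes / 1000)  # 250.0 (KB)
```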
<hr />
<h2 id="heading-grayscale-conversion-concept">📷 Grayscale Conversion: Concept</h2>
<p>To convert RGB or RGBA to grayscale, take a weighted sum of the color channels (these are the ITU-R BT.601 luma weights):</p>
<pre><code class="lang-text">Gray = 0.299*R + 0.587*G + 0.114*B
</code></pre>
<p>If there's an alpha channel, we usually preserve it.</p>
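<p>Here is the formula applied to single pixels, as a minimal sketch. <code>to_gray</code> is a hypothetical helper; it rounds to the nearest integer and passes alpha through untouched.</p>

```python
def to_gray(pixel):
    """Convert one (R, G, B, A) pixel to grayscale, keeping alpha as-is."""
    r, g, b, a = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y, a)

print(to_gray((255, 0, 0, 128)))  # (76, 76, 76, 128)
print(to_gray((0, 255, 0, 255)))  # (150, 150, 150, 255)
print(to_gray((0, 0, 255, 0)))    # (29, 29, 29, 0)
```

<p>Green contributes the most and blue the least, which matches how bright each color looks to the human eye.</p>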
<hr />
<h2 id="heading-python-example-grayscale-with-pil-numpy">📄 Python Example: Grayscale with PIL + NumPy</h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

img = Image.open(<span class="hljs-string">"input.png"</span>).convert(<span class="hljs-string">"RGBA"</span>)
data = np.array(img)

<span class="hljs-comment"># Extract the R, G, B, and alpha channels</span>
r, g, b, a = data[:,:,<span class="hljs-number">0</span>], data[:,:,<span class="hljs-number">1</span>], data[:,:,<span class="hljs-number">2</span>], data[:,:,<span class="hljs-number">3</span>]
gray = (<span class="hljs-number">0.299</span> * r + <span class="hljs-number">0.587</span> * g + <span class="hljs-number">0.114</span> * b).astype(np.uint8)

<span class="hljs-comment"># Combine grayscale with alpha</span>
result = np.stack((gray, gray, gray, a), axis=<span class="hljs-number">-1</span>)
Image.fromarray(result, mode=<span class="hljs-string">"RGBA"</span>).save(<span class="hljs-string">"gray_output.png"</span>)
</code></pre>
<hr />
<h2 id="heading-apple-silicon-gpu-core-image-grayscale-example">🌟 Apple Silicon GPU: Core Image Grayscale Example</h2>
<pre><code class="lang-swift"><span class="hljs-keyword">import</span> Foundation
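<span class="hljs-keyword">import</span> CoreImage
<span class="hljs-keyword">import</span> CoreImage.CIFilterBuiltins  <span class="hljs-comment">// Provides CIFilter.photoEffectMono()</span>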
<span class="hljs-keyword">import</span> AppKit  <span class="hljs-comment">// For macOS</span>

<span class="hljs-keyword">let</span> input = <span class="hljs-string">"/path/to/input.png"</span>
<span class="hljs-keyword">let</span> output = <span class="hljs-string">"/path/to/output.png"</span>
<span class="hljs-keyword">let</span> ciImage = <span class="hljs-type">CIImage</span>(contentsOf: <span class="hljs-type">URL</span>(fileURLWithPath: input))!

<span class="hljs-keyword">let</span> <span class="hljs-built_in">filter</span> = <span class="hljs-type">CIFilter</span>.photoEffectMono()
<span class="hljs-built_in">filter</span>.inputImage = ciImage
<span class="hljs-keyword">let</span> outputCI = <span class="hljs-built_in">filter</span>.outputImage!

<span class="hljs-keyword">let</span> context = <span class="hljs-type">CIContext</span>(options: [.useSoftwareRenderer: <span class="hljs-literal">false</span>])
<span class="hljs-keyword">let</span> cgImage = context.createCGImage(outputCI, from: outputCI.extent)!

<span class="hljs-keyword">let</span> rep = <span class="hljs-type">NSBitmapImageRep</span>(cgImage: cgImage)
<span class="hljs-keyword">let</span> pngData = rep.representation(using: .png, properties: [:])
<span class="hljs-keyword">try</span>! pngData?.write(to: <span class="hljs-type">URL</span>(fileURLWithPath: output))
</code></pre>
<ul>
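<li><p>Runs on the GPU under the hood (Metal), since the context disables the software renderer</p>
</li>
<li><p>Transparent PNGs are supported</p>
</li>
<li><p><code>photoEffectMono</code> is a stylized black-and-white effect, so its output will not match the weighted-sum formula exactly</p>
</li>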
</ul>
<hr />
<h2 id="heading-nvidia-cuda-example-grayscale-in-c">⚙️ NVIDIA CUDA Example: Grayscale in C++</h2>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;cuda_runtime.h&gt;</span></span>
<span class="hljs-function">__global__ <span class="hljs-keyword">void</span> <span class="hljs-title">rgbToGray</span><span class="hljs-params">(<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>* in, <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>* out, <span class="hljs-keyword">int</span> w, <span class="hljs-keyword">int</span> h)</span> </span>{
    <span class="hljs-keyword">int</span> x = blockIdx.x * blockDim.x + threadIdx.x;
    <span class="hljs-keyword">int</span> y = blockIdx.y * blockDim.y + threadIdx.y;
    <span class="hljs-keyword">int</span> idx = (y * w + x) * <span class="hljs-number">3</span>;
    <span class="hljs-keyword">if</span> (x &lt; w &amp;&amp; y &lt; h) {
        <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span> r = in[idx];
        <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span> g = in[idx + <span class="hljs-number">1</span>];
        <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span> b = in[idx + <span class="hljs-number">2</span>];
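        <span class="hljs-comment">// Weighted sum, explicitly truncated to an 8-bit value</span>
        out[y * w + x] = <span class="hljs-keyword">static_cast</span>&lt;<span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>&gt;(<span class="hljs-number">0.299f</span> * r + <span class="hljs-number">0.587f</span> * g + <span class="hljs-number">0.114f</span> * b);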
    }
}

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">main</span><span class="hljs-params">()</span> </span>{
    <span class="hljs-keyword">int</span> w = <span class="hljs-number">250</span>, h = <span class="hljs-number">250</span>;
    <span class="hljs-keyword">int</span> imgSize = w * h * <span class="hljs-number">3</span>;
    <span class="hljs-keyword">int</span> graySize = w * h;

    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>* h_in = <span class="hljs-keyword">new</span> <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>[imgSize];
    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>* h_out = <span class="hljs-keyword">new</span> <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span>[graySize];
    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i &lt; imgSize; i += <span class="hljs-number">3</span>) h_in[i] = <span class="hljs-number">255</span>, h_in[i+<span class="hljs-number">1</span>] = <span class="hljs-number">0</span>, h_in[i+<span class="hljs-number">2</span>] = <span class="hljs-number">0</span>;

    <span class="hljs-keyword">unsigned</span> <span class="hljs-keyword">char</span> *d_in, *d_out;
    cudaMalloc(&amp;d_in, imgSize);
    cudaMalloc(&amp;d_out, graySize);
    cudaMemcpy(d_in, h_in, imgSize, cudaMemcpyHostToDevice);

    <span class="hljs-function">dim3 <span class="hljs-title">block</span><span class="hljs-params">(<span class="hljs-number">16</span>, <span class="hljs-number">16</span>)</span></span>;
    <span class="hljs-function">dim3 <span class="hljs-title">grid</span><span class="hljs-params">((w+<span class="hljs-number">15</span>)/<span class="hljs-number">16</span>, (h+<span class="hljs-number">15</span>)/<span class="hljs-number">16</span>)</span></span>;
    rgbToGray&lt;&lt;&lt;grid, block&gt;&gt;&gt;(d_in, d_out, w, h);
    cudaMemcpy(h_out, d_out, graySize, cudaMemcpyDeviceToHost);

    cudaFree(d_in); cudaFree(d_out);
    <span class="hljs-keyword">delete</span>[] h_in; <span class="hljs-keyword">delete</span>[] h_out;
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
}
</code></pre>
<ul>
<li><p>Requires NVIDIA GPU + CUDA</p>
</li>
<li><p>Ideal for massive parallel image/data processing</p>
</li>
</ul>
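<p>The kernel above never prints or checks its result, so a CPU reference is handy for comparison. The sketch below mirrors the kernel's flat indexing on an interleaved RGB buffer and the launch-grid arithmetic from <code>main</code>. The function name <code>rgb_to_gray_flat</code> is hypothetical, and it truncates just like the float-to-<code>unsigned char</code> conversion in the kernel.</p>

```python
def rgb_to_gray_flat(buf, w, h):
    """CPU reference that mirrors the kernel's indexing, one pixel at a time."""
    out = bytearray(w * h)
    for y in range(h):
        for x in range(w):
            idx = (y * w + x) * 3  # start of this pixel in the RGB buffer
            r, g, b = buf[idx], buf[idx + 1], buf[idx + 2]
            out[y * w + x] = int(0.299 * r + 0.587 * g + 0.114 * b)
    return out

w, h = 250, 250
h_in = bytes([255, 0, 0]) * (w * h)  # all-red test image, as in main()
expected = rgb_to_gray_flat(h_in, w, h)
print(expected[0], len(expected))    # 76 62500

# Ceiling division gives enough 16x16 blocks to cover every pixel.
grid = ((w + 15) // 16, (h + 15) // 16)
print(grid)                          # (16, 16)
```

<p>A 16x16 grid of 16x16 blocks launches 256x256 threads for a 250x250 image, which is why the kernel needs its bounds check.</p>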
<hr />
<h2 id="heading-benchmarks-approximate">🚀 Benchmarks (Approximate)</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Method</td><td>Input Size</td><td>Execution Time</td><td>GPU Utilization</td></tr>
</thead>
<tbody>
<tr>
<td>PIL (CPU, Python)</td><td>250x250</td><td>~12 ms</td><td>❌</td></tr>
<tr>
<td>Core Image (Mac)</td><td>250x250</td><td>~2–3 ms</td><td>✅</td></tr>
<tr>
<td>CUDA (NVIDIA)</td><td>250x250</td><td>~0.5–1 ms</td><td>✅ ✅</td></tr>
</tbody>
</table>
</div><p><em>Note: These figures are rough estimates. Real timings vary with hardware, drivers, and the cost of moving data to and from the GPU.</em></p>
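<p>If you want numbers for your own machine, a small timing harness is enough to get started. This is only a sketch: <code>time_ms</code> and <code>luma_rows</code> are hypothetical helpers, and the pure-Python loop being timed is not the PIL code from earlier, so its result is not comparable to the table.</p>

```python
import time

def luma_rows(pixels):
    """Grayscale a flat list of (R, G, B) tuples in pure Python."""
    return [int(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

def time_ms(fn, *args, repeats=5):
    """Return the best wall-clock time over several runs, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, (time.perf_counter() - start) * 1000)
    return best

pixels = [(255, 0, 0)] * (250 * 250)  # same 250x250 size as the table
print(f"{time_ms(luma_rows, pixels):.2f} ms")
```

<p>Taking the best of several runs reduces noise from caches and background processes. For GPU code, remember to include the transfer time if that is part of your real workload.</p>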
<hr />
<h2 id="heading-when-to-use-what">🧭 When to Use What?</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Use Case</td><td>Best Format / Method</td></tr>
</thead>
<tbody>
<tr>
<td>Web UI, icons</td><td>PNG / WebP Lossless</td></tr>
<tr>
<td>Photography</td><td>JPG / WebP Lossy</td></tr>
<tr>
<td>Transparent graphics</td><td>PNG / WebP</td></tr>
<tr>
<td>ML input pipelines</td><td>PNG / BMP (exact pixels)</td></tr>
<tr>
<td>GPU image filters</td><td>Apple Core Image / CUDA</td></tr>
</tbody>
</table>
</div><hr />
<h2 id="heading-final-thoughts">🧠 Final Thoughts</h2>
<ul>
<li><p>All raster images are just matrices of pixel values</p>
</li>
<li><p>Choosing between <strong>lossy</strong> and <strong>lossless</strong> depends on your use case</p>
</li>
<li><p>Use <strong>GPU</strong> (Apple Silicon / CUDA) for performance-heavy image processing</p>
</li>
<li><p>Use <strong>PNG/WebP</strong> when transparency or precision matters</p>
</li>
</ul>
]]></content:encoded></item></channel></rss>