Optional subpixel sampling for high-Q zoom

Ideas for improvements and requests for new features in XnView Classic




Post by stephan_g »

This is a trick that some camera manufacturers use to squeeze more resolution out of their displays (where 320x240 already counts as high-res); the infamous MS ClearType also employs it. It requires that the subpixel structure of the display (monitor) be known, since exactly that structure is exploited. A user can find this out with a loupe; on my trusty Samsung 191T it's the classic horizontal R-G-B stripe pattern. I'm not sure whether PC monitors with triangular patterns also exist, but I think B-G-R and other permutations are in use.

Normally one assumes that the monitor displays whole pixels, each responsible for all of R, G and B. With subpixel sampling, however, the R, G and B channels of the interpolated image are sampled at slightly shifted positions matching the subpixel structure.

Imagine this monochrome 3x1 "image" - three samples at different brightness levels:

Code:

X . #

along with this kind of 1x1 "monitor", whose single pixel consists of three subpixels:

Code:

R G B

Normally the output might look something like this, all three subpixels driven with the same averaged value:

Code:

o o o

With subpixel sampling, the result could be:

Code:

X . #
It is obvious that the luminance resolution has been increased, but also that, if we're unlucky, artifacts may be introduced - for example, a single bright white car light in a strongly zoomed-out image ending up as one red subpixel only. (It is therefore something like the inverse of Bayer interpolation.)
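
The car-light artifact is easy to reproduce. Here is my own sketch (numpy, nothing to do with XnView's actual code) of the naive, unfiltered version of the sampling:

```python
import numpy as np

def naive_subpixel_downscale(row):
    """Downscale a monochrome 1-D image 3:1 onto an R-G-B stripe:
    each output pixel's R, G and B channels simply take the three
    source samples they sit over. No lowpass filter, so this is
    the artifact-prone case described above."""
    row = np.asarray(row, dtype=float)
    assert row.size % 3 == 0
    return row.reshape(-1, 3)  # one row per output pixel: [R, G, B]

# a single bright "car light" on a dark background, zoomed out 3:1
pixels = naive_subpixel_downscale([0, 0, 0, 255, 0, 0, 0, 0, 0])
# the light survives only in the R subpixel of the second output
# pixel, so it shows up as a purely red dot
```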

I am not quite sure how this goes together with our beloved sampling theorem. As with normal high-quality display, this is an undersampling problem requiring lowpass filtering, but this time with R, G and B shifted. OK, maybe I've got it: in the lowpass-filtered image (a step I left out in my example, which therefore wasn't too well chosen), a scenario like an object occupying only one subpixel must not occur, because in our R-G-B scenario this would mean a horizontal frequency of 3/2 fs (plus harmonics)! The possible increase in luminance resolution is therefore likely to be limited, but a smoother image display should be possible nonetheless, and isn't quality always an argument?
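
A quick sanity check of that 3/2 fs figure (my own arithmetic, with fs taken as one sample per output pixel, using exact fractions):

```python
from fractions import Fraction

fs = Fraction(1)                  # sampling rate: 1 sample per output pixel
subpixel_pitch = Fraction(1, 3)   # R-G-B stripe: 3 subpixels per pixel
period = 2 * subpixel_pitch       # a one-subpixel feature is half a cycle
frequency = 1 / period
assert frequency == Fraction(3, 2) * fs   # 3/2 fs, far above Nyquist (fs/2)
```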

Wouldn't this be worth a try, or what do you think?

Post by foxyshadis »

Experimental resizing is one of my favorite hobbies, so I figured, why not give it a shot? Initial results can be summed up as: not useful at all for upsizing; the naive subpixel method actually looks worse; arches & strokes look smoother with a better method (and as sharp as bicubic); gradients and photos are either worse or only minimally better.

Steps to test (naive method):
* Open in photoshop
* Resize to (3x final width)x(final height)
* In channel palette, split channels
* Crop 1px off left of green, 2px off left of blue
* Resize all to final width with Nearest Neighbor (aka Point)
* Merge channels, compare with other methods.
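
For the record, the steps above can also be sketched in code. This is my own numpy approximation (no Photoshop involved), with a minimal width-only bilinear resize standing in for Photoshop's:

```python
import numpy as np

def bilinear_resize_width(img, new_w):
    """Minimal bilinear resize along the width axis only.
    img: float array of shape (h, w, 3)."""
    h, w = img.shape[:2]
    x = np.linspace(0, w - 1, new_w)
    x0 = np.floor(x).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    t = (x - x0)[None, :, None]
    return img[:, x0] * (1 - t) + img[:, x1] * t

def naive_subpixel_resize(img, final_w):
    """The recipe above: resize to 3x final width, crop 1 px off the
    left of green and 2 px off the left of blue, point-sample every
    third column (nearest neighbor), then merge the channels."""
    wide = bilinear_resize_width(img.astype(float), 3 * final_w)
    r = wide[:, 0::3, 0]   # red: no shift
    g = wide[:, 1::3, 1]   # green: 1 px left crop
    b = wide[:, 2::3, 2]   # blue: 2 px left crop
    return np.stack([r, g, b], axis=-1).round().clip(0, 255).astype(np.uint8)
```

On a flat image this is a no-op; the interesting (and artifact-prone) behavior only shows up around sharp vertical edges.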

I used bilinear resize, because that's what xnview currently uses, and it doesn't appear to be moving to anything else anytime soon.

By not cropping green but instead bilinear-resizing it down to the final size, green is prevented from being too bright next to other dim pixels, which virtually eliminates color fringing while retaining sharpness. By tweaking the exact ratio of the channel weights you could trade more definition for a little more fringing, but that would require putting together a matlab or C++ project, and I'm not that interested yet.
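
A self-contained sketch of that green treatment (my own code; a simple 3-tap box average stands in for the bilinear downsize, which is close enough to show the effect):

```python
import numpy as np

def subpixel_sample_smooth_green(wide):
    """wide: float array (h, 3*W, 3), the image already resized to
    3x the final width. R and B are point-sampled at their subpixel
    positions; G is averaged over each group of three columns
    instead of being cropped and point-sampled, which keeps green
    from overshooting next to dim pixels and suppresses fringing."""
    h, w3, _ = wide.shape
    assert w3 % 3 == 0
    W = w3 // 3
    r = wide[:, 0::3, 0]
    b = wide[:, 2::3, 2]
    g = wide[:, :, 1].reshape(h, W, 3).mean(axis=2)
    return np.stack([r, g, b], axis=-1)
```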

The overall effect for b&w documents is that it's as sharp as bicubic. In continuous-tone images it looks slightly off, but you can only tell by flipping rapidly back and forth, and it can cause severe color fringing on colored lettering/backgrounds. It only works for high-contrast, low-saturation areas, like letters. Good for documents and comics, then.

By using this in combination with other resizing methods, such as 3-4 tap lanczos or 3-tap spline, you can achieve slightly better definition than with the resize alone (and much better than bilinear, as you'd expect).

The downside is that it's much slower: on average around 1.5-2x the time, since large images are resized to 3x width and then down to the final width. That time would be better spent on a sharper, higher-definition resize that benefits all images, like lanczos.

However, there is a way to do everything in one step. It requires a little more math and a more advanced resize engine than xnview currently has, I'm guessing: subpixel cropping/shifting during the resize itself. Avisynth is one of the few engines with this level of functionality, and I'm going to experiment with it later to see how well it works.