• See also:

    A Pixel Is Not a Little Square (1995) [pdf] – http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf

    • This is written from a rather narrow perspective (of signal processing) and is clearly wrong in other contexts. For an image sensor designer, gate length, photosensitive area and pixel pitch are all real-valued measurements. That pixels are laid out on a grid simply reflects the ease of constructing electronic circuits this way. Alternative arrangements are possible, Sigma's depth pixels being one commercial example.
      • OK, but that’s not what a digital image is. Images are designed to be invariant across camera capture and display hardware. The panel driver should interpret the DSP representation into an appropriate electronic pixel output.
        • That's only for images coming directly from a camera. If the images were generated in another way, the idea that a pixel is a little square is sometimes OK (pixel art, for example).
        • Yeah but the article is about a pixel, which has different meanings. Making blanket statements is not helpful in resolving definitions.

          Truth is, a pixel is both a sample and a transducer. And in transduction, a pixel is both an integrator and an emitter.

          • I’ll quote my other comment:

            > If you are looking to understand how your operating system will display images, or how your graphics drivers work, or how photoshop will edit them, or what digital cameras aim to produce, then it’s the point sample definition.

            • tobr
              Sometimes yes. Sometimes no! There are certainly situations where a pixel will be scaled, displayed, edited or otherwise treated as a little square.
              • A little rectangle even.
                • What is a little rectangle? Physical display pixels are separate RGB leds with a non-rectangular shape.
                  • In medical imaging, data are often acquired using anisotropic resolution. So a pixel (or voxel in 3D) can be an averaged signal sample originating from 2mm of tissue in one direction and 0.9mm in another direction.
                  • On the BBC Micro in graphics mode 2, the pixel is a 2:1 rectangle. Other geometries for different modes. It's going back a bit, but very much living memory!

                    https://beebwiki.mdfs.net/MODE_2

                  • They're rectangles on my monitor.
                    • Do the 3 separate subpixels look like a Photoshop pixel?
                      • I'm not sure what you mean by Photoshop pixel.

                        They also look more like a square when I back away. And the mismatch of the square model doesn't mean the point model is good.

                        • > And the mismatch of the square model

                          So your intuition for why squares makes sense is wrong, but you’re still holding on to it.

                          > doesn't mean the point model is good.

                          What does show it’s a good model is all the theory of image processing and the implementation of this theory in camera and display systems.

                          You’re welcome to propose an alternative theory, and if that is consistent, try to get manufacturers to adopt it.

                          • > So your intuition for why squares makes sense is wrong, but you’re still holding on to it.

                            I said subpixels are rectangles. Because they are.

                            If the point model was all you need, then objects small enough to slip between points would be invisible. Which is not the case.

                            In particular a shot of the night sky would look pure black.

                            So if being wrong means we should abandon the model, then we can't use squares or points.

                            • > I said subpixels are rectangles. Because they are

                              https://upload.wikimedia.org/wikipedia/commons/4/4d/Pixel_ge...

                              I look forward to your paper about a superior digital image representation.

                              • So is this a cheap gotcha because I only said "my monitor" and "most screens" the first couple times and didn't repeat it a third time? It's the one labeled just "LCD".

                                Or are you arguing that the slightly rounded corners on the rectangles make a significant difference in how the filtering math works out? It doesn't. On a scale between "gaussian" and "perfect rectangles", the filtering for this shape is 95% toward the latter.

                    • In black and white or some other cases, yes, but there are typically subpixels that can give things like sufficiently sparse text stroke transitions 3x the horizontal resolution.
                      • The subpixels are rectangles on most screens.
                        • They tend to have rounded or cut corners and are not uniform.
        • We commonly use hardware like LCDs and printers that render a sharp transition between pixels without the Gibbs phenomenon. CRT scanlines were close to an actual 1D signal (though not directly controlled by the pixels, which the video cards still tried to make square-ish), but AFAIK we've never had a display that reconstructs the continuous 2D signal we assume in image processing.

          In signal processing you have a finite number of samples of an infinitely precise continuous signal, but in image processing you have a discrete representation mapped to a discrete output. It's continuous only when you choose to model it that way. Discrete → continuous → discrete conversion is a useful tool in some cases, but it's not the whole story.

          There are images designed for very specific hardware, like sprites for CRT monitors, or font glyphs rendered for LCD subpixels. More generally, nearly all bitmap graphics assumes that pixel alignment is meaningful (and that has been true even in the CRT era before the pixel grid could be aligned with the display's subpixels). Boxes and line widths, especially in GUIs, tend to be designed for integer multiples of pixels. Fonts have/had hinting for aligning to the pixel grid.

          Lack of grid alignment, an equivalent of a phase shift that wouldn't matter in pure signal processing, is visually quite noticeable at resolutions where the hardware pixels are little squares to the naked eye.

          • I think you are saying there are other kinds of displays which are not typical monitors and those displays show different kinds of images - and I don’t disagree.
        • Well, "what a digital image is" is a sequence of numbers. There's no single correct way to interpret the numbers, it depends on what you want to accomplish. If your digital image is a representation of, say, the dead components in an array of sensors, the signal processing theoretic interpretation of samples may not be useful as far as figuring out which sensors you should replace.
          • > There's no single correct way to interpret the numbers

            They are just bits in a computer. But there is a correct way to interpret them in a particular context. For example, 32 bits can be meaningless - or they can have an interpretation as a two's complement integer, which is well defined.

            If you are looking to understand how an operating system will display images, or how graphics drivers work, or how photoshop will edit them, or what digital cameras produce, then it’s the point sample definition.

            • Cameras don't take point samples. That's an approximation, just as inaccurate as a rectangle approximation.

              And for pixel art, the intent is usually far from points on a smooth color territory.

              Multiple interpretations matter within different contexts inside the computer context.

              • > Cameras don't take point samples. That's an approximation

                They use a physical process to attempt to determine light at a single point. That’s the model they try to approximate.

                > And for pixel art, the intent is usually far from points on a smooth color territory.

                And notice that to display pixel art you need to tell it to interpret the image data differently.

                Also, it has a vastly different appearance on a CRT, where it was designed, and that appearance is less like a rectangle.

                  • > They use a physical process to attempt to determine light at a single point. That’s the model they try to approximate.

                  According to who?

                  A naked camera sensor with lens sure doesn't do that, it collects squares of light, usually in a mosaic of different colors. Any point approximation would have to be in software.

                  • Yep, cameras filter the input from sensors.

                    > Any point approximation would have to be in software.

                    Circuits can process signals too.

                    • They usually do, but their software decisions are not gospel. They don't change the nature of the underlying sensor, which grabs areas that are pretty square.
                      • > which grabs areas

                        And outputs what? Just because the input is an area does not mean the output is an area.

                        What if it outputs the peak of the distribution across the area?

                        > that are pretty square.

                        If we look at a camera sensor and do not see a uniform grid of packed area elements would that convince you?

                        I notice you haven’t shared any criticism of the point model - widely understood by the field.

                        • > And outputs what? Just because the input is an area does not mean the output is an area.

                          > What if it outputs the peak of the distribution across the area?

                          It outputs a voltage proportional to the (filtered) photon count across the entire area.

                          > If we look at a camera sensor and do not see a uniform grid of packed area elements would that convince you?

                          Non-uniformity won't convince me points are a better fit, but if the median camera doesn't use a grid I'll be interested in what you have to show.

                          > I notice you haven’t shared any criticism of the point model - widely understood by the field.

                          This whole comment line is a criticism of the input being modeled as points, and my criticism of the output is implied by my pixel art comment above (because point-like upscaling causes a giant blur) and also exists in other comments like this one: https://news.ycombinator.com/item?id=43777957

                          • > It outputs a voltage proportional to the (filtered) photon count across the entire area.

                            This is not true. And it’s even debunked in the original article.

                            • > And it’s even debunked in the original article.

                              No, it's not. That article does not mention digital cameras anywhere. It briefly says that scanners give a gaussian, and I don't want to do enough research to see how accurate that is, but that's the only input device that gets detailed.

                              It also gives the impression that computer rendering uses boxes, when usually it's the opposite and rendering uses points.

        • Well the camera sensor captures a greater dynamic range than the display or print media or perhaps even your eyes, so something has to give. If you ever worked with a linear file without gamma correction you will understand what I mean.
          • And that full dynamic range is in the image’s point samples, ready to be remapped for a physical output.
    • If you want a good example of what happens when you treat pixels like they're just little squares, disable font smoothing. Anti-aliasing, fonts that look good, and smooth animation are all dependent upon subpixel rendering.

      https://en.wikipedia.org/wiki/Subpixel_rendering

      Edit: For the record, I'm on Win 10 with a 1440p monitor and disabling font smoothing makes a very noticeable difference.

      People are acting like this is some issue that no longer exists, and you don't have to be concerned with subpixel rendering anymore. That's not true, and highlights a bias that's very prevalent here on HN. Just because I have a fancy retina display doesn't mean the average user does. If you pretend like subpixel rendering is no longer a concern, you can run into situations where fonts look great on your end, but an ugly jagged mess for your average user.

      And you can tell who the Apple users are because they believe all this went away years ago.

      • This might have been a good example fifteen years ago. These days with high-DPI displays you can't perceive a difference between font smoothing being turned on and off. On macOS for example font smoothing adds some faux bold to the fonts, and it's long been recommended to turn it off. See for example the influential article https://tonsky.me/blog/monitors/ which explains that font smoothing used to do subpixel antialiasing, but the whole feature was removed in 2018. It also explains that this checkbox doesn't even control regular grayscale antialiasing, and I'm guessing it's because downscaling a rendered @2x framebuffer down to the physical resolution inherently introduces antialiasing.
        • Maybe true for Mac users, but the average Win 10 desktop is still using a 1080p monitor.
          • The users may have 1080p monitors, but even Windows does not do subpixel antialiasing in its new apps (UWP/WinUI) anymore. On Linux, GTK4 does not do subpixel antialiasing anymore.

            The reason is mostly that it is too hard to make it work under transformations and compositing, while higher resolution screens are a better solution for anyone who cares enough.

            • > even Windows does not do subpixel antialiasing in its new apps (UWP/WinUI) anymore

              This is a little misleading, as the new versions of Edge and Windows Terminal do use subpixel antialiasing.

              What Microsoft did was remove the feature on a system level, and leave implementation up to individual apps.

            • Is this why font rendering on my Win11 laptop looks like shit on my 1440p external monitor?

              Laptop screen is 4k with 200% scaling.

              Seriously the font rendering in certain areas (i.e. windows notification panel) is actually dogshit. If I turn off the 200% scaling on the laptop screen then reboot it looks correct again.

      • You’re conflating different topics. LCD subpixel rendering and font smoothing is often implemented by treating the subpixels as little rectangles, which is the same mistake as treating pixels as squares.

        Anti-aliasing can be and is done on squares routinely. It’s called ‘greyscale antialiasing’ to differentiate from LCD subpixel antialiasing, but the name is confusing since it works and is most often used on colors.

        The problem Alvy-Ray is talking about is far more subtle. You can do anti-aliasing with little squares, but the result isn’t 100% correct and is not the best result possible no matter how many samples you take. What he’s really referring to is what signal processing people call a box filter, versus something better like a sinc or Gaussian or Mitchell filter.

        Regarding your edit, on a high DPI display there’s very little practical difference between LCD subpixel antialiasing and ‘greyscale’ (color) antialiasing. You don’t need LCD subpixels to get effective antialiasing, and you can get mostly effective antialiasing with square shaped pixels.
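
        To make the filter distinction concrete, here's a rough 1D sketch (a toy resampler of my own, not anything from Alvy Ray's paper; the Mitchell-Netravali coefficients are the standard B = C = 1/3 ones):

            import numpy as np

            def box(x):
                # Box filter: every source pixel inside the destination pixel's
                # footprint counts equally; everything outside counts zero.
                return np.where(np.abs(x) <= 0.5, 1.0, 0.0)

            def mitchell(x, B=1/3, C=1/3):
                # Mitchell-Netravali kernel, a common "better than box" filter.
                x = np.abs(x)
                near = ((12 - 9*B - 6*C) * x**3 + (-18 + 12*B + 6*C) * x**2 + (6 - 2*B)) / 6
                far = ((-B - 6*C) * x**3 + (6*B + 30*C) * x**2 + (-12*B - 48*C) * x + (8*B + 24*C)) / 6
                return np.where(x < 1, near, np.where(x < 2, far, 0.0))

            def resample(signal, new_len, kernel):
                # Center the kernel on each output position (in source-pixel units,
                # minification only) and take a normalized weighted sum of the source.
                src = np.arange(len(signal))
                scale = len(signal) / new_len
                out = np.empty(new_len)
                for i in range(new_len):
                    center = (i + 0.5) * scale - 0.5
                    w = kernel((src - center) / scale)
                    out[i] = np.dot(w, signal) / w.sum()
                return out

            step = np.r_[np.zeros(8), np.ones(8)]
            print(resample(step, 8, box))       # exact block averages: a crisp step
            print(resample(step, 8, mitchell))  # softer step with a faint under/overshoot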

        • We're going off on tangents here, I only brought this up to reinforce the idea that a pixel is not a little square.

          And I guess I should have explicitly stated that I'm not talking about high-DPI displays, subpixel rendering obviously doesn't do much good there!

          My point is simply this: if you don't treat pixels like discrete little boxes that display a single color, you can use subpixels to effectively increase the resolution on low-DPI monitors. Yes, you can use greyscale antialiasing instead, and you will even get better performance, but the visual quality will suffer on your standard desktop PC monitor.

          • Yes, LCD subpixel rendering is a tangent, it’s not relevant to Alvy-Ray’s point. And again, LCD subpixel rendering is treating pixels as little squares; cutting a square pixel into 3 rectangles doesn’t change anything with respect to what Alvy Ray was talking about. So-called grayscale antialiasing on low-DPI displays is nearly as good as LCD subpixel, the quality differences are pretty minor. (Plus it’s a tradeoff and you get the downside of chromatic aberration.) I think you’re suggesting visual quality suffers when you don’t do any antialiasing at all, which is true, but that’s not what Alvy-Ray was talking about. Treating a pixel like a square does not imply lack of antialiasing.
      • I don't think that's true anymore. Modern high-resolution displays have pixels small enough that they don't really benefit from sub-pixel rendering, and logical pixels have become decoupled from physical pixels to the point of making sub-pixel rendering a lot more difficult.
        • It's still true for everyone who doesn't have a high DPI monitor. We're talking about a feature that doubles the price of the display for little practical value. It's not universal, and won't be for a long time.
    • Agreed. The fact that a pixel is an infinitely small point sample - and not a square with area - is something that Monty explained in his demo too: https://youtu.be/cIQ9IXSUzuM?t=484
      • A pixel is not the sample(s) that its value came from. Given a pixel (of image data) you don't know what samples are behind it. It could have been point-sampled with some optical sensor far smaller than the pixel (but not infinitely small, obviously). Or it could have been sampled with a Gaussian bell-shaped filter a bit wider than the pixel.

        A 100x100 thumbnail that was reduced from a 1000x1000 image might have pixels which are derived from 100 samples of the original image (e.g. a simple average of a 10x10 pixel block). Or other possibilities.

        As an abstraction, a pixel definitely doesn't represent a point sample, let alone an infinitely small one. (There could be some special context in which it does but not as a generality.)

        • > A 100x100 thumbnail that was reduced from a 1000x1000 image might have pixels which are derived from 100 samples of the original image (e.g. a simple average of a 10x10 pixel block). Or other possibilities.

          And if a downsampling algorithm tries to approximate a point sample, it'll give you a massively increased chance of ugly moire patterns.

          The audio equivalent is that you drop 3/4 of your samples and it reflects the higher frequencies down into the lower ones and hurts the quality. You need to do a low-pass filter first. And "point samples from a source where no frequencies exist above X, also you need to change X before doing certain operations" is very different and significantly more complicated than "point samples". Point samples are one leaky abstraction among many leaky abstractions, not the truth. Especially when an image has a hard edge with frequencies approaching infinity.
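
          A rough sketch of that difference in Python (the stripe pattern and the block-average prefilter are my own toy choices; a real resizer would use a better filter):

              import numpy as np

              def downsample_point(img, factor):
                  # Naive "point sample" downscale: keep every factor-th pixel.
                  # Frequencies above the new Nyquist limit alias into the result.
                  return img[::factor, ::factor]

              def downsample_box(img, factor):
                  # Average each factor x factor block before decimating. The block
                  # average is a crude low-pass filter, so there is far less aliasing.
                  h = img.shape[0] - img.shape[0] % factor
                  w = img.shape[1] - img.shape[1] % factor
                  blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
                  return blocks.mean(axis=(1, 3))

              # Vertical stripes at 0.3 cycles/pixel: far too fine to survive a 4x downscale.
              x = np.arange(64)
              stripes = np.tile(np.cos(2 * np.pi * 0.3 * x), (64, 1))

              print(np.ptp(downsample_point(stripes, 4)))  # ~1.8: stripes survive at a false frequency
              print(np.ptp(downsample_box(stripes, 4)))    # ~0.35: the prefilter has mostly removed them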

          • But from the pixels alone, you don't know whether the moire is an artifact of sampling of something that was free of moire, or whether an existing image of moire was sampled and reproduced.
            • I have the source image. I know the downscaled version looks awful and wrong. I know the naive point algorithm was the wrong one to use.

              Someone with just the small version wouldn't know if it's supposed to look like that, but we're not asking them.

              Unless they can infer it's a picture of normal objects, in which case they can declare it's moire and incorrect.

      • Eh, calling it infinitely small is at least as misleading as calling it a square. While they are both mostly correct, neither Monty’s explanation nor Alvy-Rays are all that good. Pixels are samples taken at a specific point, but pixel values do represent area one way or another. Often they are not squares, but on the other hand LCD pixels are pretty square-ish. Camera pixels are integrals over the sensor area, which captures an integral over a solid angle. Pixels don’t have a standard shape, it depends on what capture or display device we’re talking about, but no physical capture or display devices have infinitely small elements.
        • Camera pixels represent an area, but pixels coming out of a 3D game engine usually represent a point sample. Hand-drawn 2d pixel art is explicitly treating pixels as squares. All of these are valid uses that must coexist on the same computer.
          • > Camera pixels represent area, but pixels coming out of a 3D game engine usually represent a point sample

            Having worked on both game engines and CG films, I think that’s misleading. Point sample is kind of an overloaded term in practice, but I still don’t think your statement is accurate. Many games, especially modern games are modeling pixels with area; explicitly integrating over regions for both visibility and shading calculations. In fact I would say games are generally treating pixels as squares not point samples. That’s what DirectX and Vulkan and OpenGL do. That’s what people typically do with ray tracing APIs in games as well. Even a point sample can still have an associated area, and games always display pixels that have area. The fact that you can’t display a point sample without using area should be reason enough to avoid describing pixels that way.

        • Some early variants of the Sony A7 had fewer but larger pixels to improve light gathering at high ISO.
      • Signal processing engineers against the rest of the world.
  • I'd say it's better to call it a unit of counting.

    If I have a bin of apples, and I say it's 5 apples wide, and 4 apples tall, then you'd say I have 20 apples, not 20 apples squared.

    It's common to specify a length by a count of items passed along that length. E.g., a city block is a ~square on the ground bounded by roads. Yet if you're traveling in a city, you might say "I walked 5 blocks." This is a linguistic shortcut, skipping implied information. If you're trying to talk about both in an unclear context, additional words are required to sufficiently convey the information; that's just how language works.

    • Exactly. Pixels are indivisible quanta, not units of any kind of distance. Saying pixel^2 makes as much sense as counting the number of atoms on the surface of a metal and calling it atoms^2.
      • chii
        So how does subpixels come into play under this idea of quanta?
        • Pixels then become containers and subpixels become quantifiable entities within each pixel. In the apple analogy, each crate contains three countable apples and you can count both the crates and the apples independently.

          This idea itself breaks down when we get to triangular subpixel rendering, which spans pixels and divides subpixels. But it's also a minor form of optical illusion, so making sense of it is inherently fraught.

          Maybe a pixel is just a pixel.

        • Quarks? They’re sub-units of hadrons but iirc they can’t be found on their own.
        • I think we'll need to use some maths from lattice QCD!
    • That is exactly how it is and it makes the whole article completely pointless. Especially as the article in the second sentence correctly writes "1920 pixels wide".
    • I think the point of the article is that you don't say "5 pixels wide x 4 pixels tall" but just "5 pixels x 4 pixels", though I would say that "5x4 pixels" is the most common and most correct terminology.

      And the article concludes with : "But it does highlight that the common terminology is imperfect and breaks the regularity that scientists come to expect when working with physical units in calculations". Which matches your conclusion.

    • Is it that, or is it a compound unit that has a defined width and height already? Something can be five football fields long by two football fields wide, for an area of ten football fields.
      • This example illustrates potential confusion around non-square pixels. 5 football fields long makes perfect sense, but I'm not sure if 2 football fields wide means "twice the width of a football field" or "width equaling twice the length of a football field". I would lean towards the latter in colloquial usage, which means that the area is definitely not the same as the area of 10 football fields
        • I would lean towards the former. I really don't think people are trying to compare the width to the length when discussing football fields casually.

          If I told you parking spots are about two bowling lanes wide... I'm obviously not trying to say they are 120 ft wide.

          • I don't think that's obvious at all if you're talking about the length and the width and not describing something I know to be much smaller in any dimension than the length of a lane.
      • No, it is a count. Pixels can have different sizes and shapes, just like apples. Technically football fields vary slightly too but not close to as much as apples or pixels.
        • Football fields also have the fun property of varying in the third dimension. They're built with a crown in the middle so that water will drain off towards the edges, and that can vary significantly between instances.

          And pixels are even starting to vary in the third dimension too, with the various curved and bendable and foldable displays.

        • Pixel counts generally represent areas by taking the number of pixels inside a region of the plane, but they can represent lengths by taking the number of pixels inside a certain extent of a single line or column of the grid: it is, actually, a thin rectangle.
        • What's the standard size of a city block, the other countable example given by the original author?
          • Yes, city blocks are like pixels or apples. They do not have a standard size or shape.

            Edit: To clarify, if someone says 3 blocks, that could vary by a factor of 3 or, in extreme cases, more, so when used as a unit of length it is a very rough estimate. It is usually used in my country as a way to know when you have reached your destination.

    • > If I have a bin of apples, and I say it's 5 apples wide, and 4 apples tall

      ...then you have a terrible bin for apple storage and should consider investing in a basket ;)

      • If you don't care about bruising
  • What a perplexing article.

    Isn't a pixel clearly specified as a picture element? Isn't the usage as a length unit just as colloquial as "It's five cars long", which is just a simplified way of saying "It is as long as the length of a car times five", where "car" and "length of car" are very clearly completely separate things?

    > The other awkward approach is to insist that the pixel is a unit of length

    Please don't. If you want a unit of length that works well with pixels, you can use Android's "dp" concept instead, which are "density independent pixels" (kinda a bad name if you think about it) and are indeed a unit of length, namely 1 dp = 158.75 micrometers, so that you have 160 dp to the inch. Then you can say "It's 10dp by 5dp, so 50 square dp in area.".
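
    A tiny sketch of the arithmetic, using the 160 dp per inch definition above (the helper names are mine):

        DP_PER_INCH = 160
        MICROMETERS_PER_INCH = 25_400

        def dp_to_micrometers(dp):
            # Physical length of a dp, independent of the screen it lands on.
            return dp * MICROMETERS_PER_INCH / DP_PER_INCH

        def dp_to_device_pixels(dp, device_ppi):
            # The same physical length covers more device pixels on denser screens.
            return dp * device_ppi / DP_PER_INCH

        print(dp_to_micrometers(1))          # 158.75
        print(dp_to_device_pixels(10, 160))  # 10.0 device pixels on a 160 ppi screen
        print(dp_to_device_pixels(10, 480))  # 30.0 device pixels on a 480 ppi screen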

    • Yeah, this isn't really that complicated. It's just colloquial usage, not rigorous dimensional analysis. Roughly no one is actually confused by either usage ("1920 by 1080" or "12 megapixels").

      It's nearly identical to North American usage of "block" (as in "city block"). Merriam Webster lists these two definitions (among many others):

      > 6 a (1): a usually rectangular space (as in a city) enclosed by streets and occupied by or intended for buildings

      > 6 a (2): the distance along one of the sides of such a block

    • Another colloquial saying to back this up is that "Oh, that house is five acres down the road" or, for a non-standard unit, "The store is three blocks away". We often use area measurements for length if it's convenient.

      The pixel is a unit of area - we just occasionally use units of area to measure length.

      • > Another colloquial saying to back this up is that "Oh, that house is five acres down the road" or, for a non-standard unit, "The store is three blocks away". We often use area measurements for length if it's convenient.

        I have never heard someone use the first instance, and I wouldn't understand what it meant. I mean, I could buy that it meant that there is a five-acre plot between that house and where we are now, but it wouldn't give me any useful idea of how far the house is other than "not too close." Perhaps you have in mind that, since the "width" of an acre is a furlong, a house 5 acres away is 5 furlongs away?

  • Well the way I see them I don't think they are a unit at all.

    And in the end, pixels are "physical things". Like ceramic tiles on a bathroom wall.

    Your wall might be however many meters in length and you might need however many square meters of tile in order to cover it. But still, if you need 10 tiles high and 20 tiles wide, you need 200 tiles to cover it. No tension there.

    Now you might argue that pixels in a scaled game don't correspond with physical objects in the screen any more. That's ok. A picture of the bathroom wall will look smaller than the wall itself. Or bigger, if you hold it next to your face. It's still 10x20=200 tiles.

  • > A Pixel Is Not A Little Square!

    > This is an issue that strikes right at the root of correct image (sprite) computing and the ability to correctly integrate (converge) the discrete and the continuous. The little square model is simply incorrect. It harms. It gets in the way. If you find yourself thinking that a pixel is a little square, please read this paper.

    > A pixel is a point sample. It exists only at a point. For a color picture, a pixel might actually contain three samples, one for each primary color contributing to the picture at the sampling point. We can still think of this as a point sample of a color. But we cannot think of a pixel as a square—or anything other than a point.

    Alvy Ray Smith, 1995 http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf

    • A pixel is simply not a point sample. A camera does not take point sample snapshots, it integrates lightfall over little rectangular areas. A modern display does not reconstruct an image the way a DAC reconstructs sounds, they render little rectangles of light, generally with visible XY edges.

      The paper's claim applies at least somewhat sensibly to CRTs, but one mustn't imagine the voltage interpolation and shadow masking a CRT does corresponds meaningfully to how modern displays work... and even for CRTs it was never actually correct to claim that pixels were point samples.

      It is pretty reasonable in the modern day to say that an idealized pixel is a little square. A lot of graphics operates under this simplifying assumption, and it works better than most things in practice.

      • > A camera does not take point sample snapshots, it integrates lightfall over little rectangular areas.

        Integrates this information into what? :)

        > A modern display does not reconstruct an image the way a DAC reconstructs sounds

        Sure, but some software may apply resampling over the original signal for the purposes of upscaling, for example. "Pixels as samples" makes more sense in that context.

        > It is pretty reasonable in the modern day to say that an idealized pixel is a little square.

        I do agree with this actually. A "pixel" in popular terminology is a rectangular subdivision of an image, leading us right back to TFA. The term "pixel art" makes sense with this definition.

        Perhaps we need better names for these things. Is the "pixel" the name for the sample, or is it the name of the square-ish thing that you reconstruct from image data when you're ready to send to a display?

        • > Integrates this information into what? :)

          Into electric charge? I don’t understand the question, and it sounds like the question is supposed to lead readers somewhere.

          The camera integrates incoming light over a tiny square into an electric charge and then reads out the charge (at least for a CCD), giving a brightness (and, with the Bayer filter in front of the sensor, a color) for the pixel. So it’s a measurement over the tiny square, not a point sample.

          • > The camera integrates incoming light into a tiny square [...] giving a brightness (and with the Bayer filter in front of the sensor, a color) for the pixel

            This is where I was trying to go. The pixel, the result at the end of all that, is the single value (which may be a color with multiple components, sure). The physical reality of the sensor having an area and generating a charge is not relevant to the signal processing that happens after that. For Smith, he's saying that this sample is best understood as a point, rather than a rectangle. This makes more sense for Smith, who was working in image processing within software, unrelated to displays and sensors.

            • It’s a single value, but it’s an integral over the square, not a point sample. If I shine a perfectly focused laser very close to the corner of one sensor pixel, I’ll still get a brightness value for the pixel. If it were a point sample, only the brightness at a single point would give an output.

              And depending on your application, you absolutely need to account for sensor properties like pixel pitch and color filter array. It affects moire pattern behavior and creates some artifacts.

              I’m not saying you can’t think of a pixel as a point sample, but correcting other people who say it’s a little square is just wrong.
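
              A toy numerical version of that corner example (the numbers are made up, just to show the integral-versus-point distinction):

                  import numpy as np

                  def spot(x, y, cx=0.05, cy=0.05, sigma=0.01):
                      # A tightly focused spot of light near the pixel's corner.
                      return np.exp(-((x - cx)**2 + (y - cy)**2) / (2 * sigma**2))

                  # Model the sensor pixel as the average irradiance over its 1x1 area
                  # (approximated on a fine grid) versus a point sample at its center.
                  n = 1000
                  s = (np.arange(n) + 0.5) / n
                  X, Y = np.meshgrid(s, s)

                  print(spot(X, Y).mean())  # small but clearly nonzero: the pixel "sees" the spot
                  print(spot(0.5, 0.5))     # ~0: a point sample at the center would miss it entirely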

              • > It affects moire pattern behavior and creates some artifacts.

                Yes. The spacing between point samples determines the sampling frequency, a fundamental characteristic of a DSP signal.

            • It's never a point source. Light is integrated over a finite area to form a single color sample. During Bayer demosaicking, contributions from neighbouring pixels are integrated to form samples of complementary color channels.
              • > Light is integrated over a finite area to form a single color sample. During Bayer demosaicking, contributions from neighbouring pixels are integrated to form samples of complementary color channels.

                Integrated into a single color sample indeed. After all the integration, mosaicking, and filtering, a single sample is calculated. That’s the pixel. I think that’s where the confusion is coming from. To Smith, the “pixel” is the sample that lives in the computer.

                The actual realization of the image sensors and their filters is not encoded in a typical image format, nor used in typical high-level image processing pipelines. For abstract representations of images, the “pixel” abstraction is used.

                The initial reply to this chain focused on how camera sensors capture information about light, and yes, those sensors take up space and operate over time. But the pixel, the piece of data in the computer, is just a point among many.

                • > Integrated into a single color sample indeed. After all the integration, mosaicking, and filtering, a single sample is calculated. That’s the pixel. I think that’s where the confusion is coming from. To Smith, the “pixel” is the sample that lives in the computer.

                  > But the pixel, the piece of data in the computer, is just a point among many.

                  Sure, but saying that this is the pixel, and negating all other forms as not "true" pixels is arbitrary. The real-valued physical pixels (including printer dots) are equally valid forms of pixels. If anything, it would be impossible for humans to sense the pixels without interacting with the real-valued forms.

                • IMO Smith misapplied the term "pixel" to mean "sample". A pixel is a physical object, a sample is a logical value that corresponds in some way to the state of the physical object (either as an input from a sensor pixel or an output to a display pixel) but is also used in signal processing. Samples aren't squares, pixels (usually) are.
      • A slightly tangential comment: integrating a continuous image on squares paving the image plane might be best viewed as applying a box filter to the continuous image, resulting in another continuous image, then sampling it point-wise at the center of each square.

        It turns out that when you view things that way, pixels as points continues to make sense.
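
        A quick numerical check of that equivalence in 1D (the function and grid are my own toy choices):

            import numpy as np

            # A smooth 1D "continuous image", finely sampled: 1000 sub-samples per pixel.
            fine = 1000
            x = (np.arange(50 * fine) + 0.5) / fine
            f = np.sin(0.7 * x) + 0.3 * np.cos(3.1 * x)

            # (a) The "little square" view: integrate over each unit interval.
            per_pixel_average = f.reshape(50, fine).mean(axis=1)

            # (b) The "filtered point sample" view: box-filter the continuous image,
            #     then point-sample it at the center of each interval.
            kernel = np.ones(fine) / fine
            filtered = np.convolve(f, kernel, mode="same")
            point_samples = filtered[fine // 2::fine]

            print(np.max(np.abs(per_pixel_average - point_samples)))  # ~0 (floating-point noise)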

      • The representation of pixels on the screen is not necessarily normative for the definition of the pixel. Indeed, since different display devices use different representations as you point out, it can't really be. You have to look at the source of the information. Is it a hit mask for a game? Then they are squares. Is it a heatmap of some analytical function? Then they are points. And so on.
      • DACs do a zero-order hold, which is equivalent to a pixel as a square.
        • They contain zero-order holds, but they also contain reconstruction filters. Ideal reconstruction filters have infinite width, which makes them clearly different in kind. And most DACs nowadays are Delta-Sigma, so the zero-order hold is on a binary value at a much higher frequency than the source signal. The dissimilarities matter.
    • The 'point' of Alvy's article is that pixels should be treated as point sources when manipulating them, not when displaying them.

      Obviously, when a pile of pixels is shown on a screen (or for that matter, collected from a camera's CCD, or blobbed by ink on a piece of paper), it will have some shape: The shape of the LCD matrix, the shape of the physical sensor, the shape of the ink blot. But those aren't pixels, they're the physical shapes of the pixels expressed on some physical medium.

      If you're going to manipulate pixels in a computer's memory (like by creating more of them, or fewer), then you'd do best by treating the pixels as sampling points - at this point, the operation is 100% sampling theory, not geometry.

      When you're done, and have an XY matrix of pixels again, you'll no doubt have done it so that you can give those pixels _shape_ by displaying them on a screen or sheet of paper or some other medium.

    • This is one of my favorite articles. Although I think you can define for yourself what your pixels are, for most it is a point sample.
  • This isn't just pixels, it's the normal way we use rectangular units in common speech:

    * A small city might be ten blocks by eight blocks, and we could also say the whole city is eighty blocks.

    * A room might be 13 tiles by 15 tiles, or 195 tiles total.

    * On graph paper you can draw a rectangle that's three squares by five squares, or 15 squares total.

  • The article starts out with an assertion right in the title and does not do enough to justify it. The title is just wrong. Saying pixels are like metres is like saying metres are like apples.

    When you multiply 3 meter by 4 meter, you do not get 12 meters. You get 12 meter squared. Because "meter" is not a discrete object. It's a measurement.

    When you have points A, B, C. And you create 3 new "copies" of those points (by geometric manipulation like translating or rotating vectors to those points), you now have 12 points: A, B, C, A1, B1, C1, A2, B2, C2, A3, B3, C3. You don't get "12 points squared". (What would that even mean?) Because points are discrete objects.

    When you have 3 apples in a row and you add 3 more such rows, you get 4 rows of 3 apples each. You now have 12 apples. You don't have "12 apples squared". Because apples are discrete objects.

    When you have 3 pixels in a row and you add 3 more such rows of pixels, you get 4 rows of 3 pixels each. You now have 12 pixels. You don't get "12 pixels squared". Because pixels are discrete objects.

    Pixels are like points and apples. Pixels are not like metres.

    • > When you multiply 3 meter by 4 meter, you do not get 12 meters. You get 12 meter squared.

      "12 meter(s) squared" sounds like a square that is 12 meters on each side. On the other hand, "12 square meters" avoids this weirdness by sounding like 12 squares that are one meter on each side, which the area you're actually describing.

      • that's just a quirk of the language.

        If you use formal notation, 12 m^2 is very clear. But I have yet to see anyone write 12 px^2.

        • It's one that really bothers me because of the unnecessary confusion it adds.

          As for the rest, see GGP's argument. px^2 doesn't make logical sense. When people use pixels as a length, it's in the same way as "I live 2 houses over" - taking a 2D or 3D object and using one of its dimensions as length/distance.

  • A pixel is a dot. The size and shape of the dot is implementation-dependent.

    The dot may be physically small, or physically large, and it may even be non-square (I used to work for a camera company that had non-square pixels in one of its earlier DSLRs, and Bayer-format sensors can be thought of as “non-square”), so saying a pixel is a certain size, as a general measure across implementations, doesn’t really make sense.

    In iOS and MacOS, we use “display units,” which can be pixels, or groups of pixels. The ratio usually changes, from device to device.

  • Pixel, used as a unit of horizontal or vertical resolution, typically implies the resolution of the other axis as well, at least up to common aspect ratios. We used to say 640x480 or 1280x1024 – now we might say 1080p or 2.5K but what we mean is 1920x1080 and 2560x1440, so "pixel" does appear to be a measure of area. Except of course it's not – it's a unit of a dimensionless quantity that measures the amount of something, like the mole. Still, a "quadratic count" is in some sense a quantity distinct from "linear count", just like angles and solid angles are distinct even though both are dimensionless quantities.

    The issue is muddied by the fact that what people mostly care about is either the linear pixel count or pixel pitch, the distance between two neighboring pixels (or perhaps rather its reciprocal, pixels per unit length). Further confounding is that technically, resolution is a measure of angular separation, and to convert pixel pitch to resolution you need to know the viewing distance.

    Digital camera manufacturers at some point started using megapixels (around the point that sensor resolutions rose above 1 MP), presumably because big numbers are better marketing. Then there's the fact that camera screen and electronic viewfinder resolutions are given in subpixels, presumably again for marketing reasons.
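
    As a sketch of that last conversion, pixel pitch plus viewing distance gives an angular resolution (example numbers are mine, roughly a 24-inch 1080p monitor at desk and across-the-room distances):

        import math

        def pixels_per_degree(pixel_pitch_mm, viewing_distance_mm):
            # Angle subtended by one pixel at the eye, inverted to pixels per degree.
            deg_per_pixel = math.degrees(2 * math.atan(pixel_pitch_mm / (2 * viewing_distance_mm)))
            return 1 / deg_per_pixel

        print(pixels_per_degree(0.28, 600))   # ~37 px/deg viewed from 60 cm
        print(pixels_per_degree(0.28, 3000))  # ~187 px/deg viewed from 3 m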

    • Digital photography then takes us on to subpixels, Bayer filters (https://en.wikipedia.org/wiki/Color_filter_array) and so on. You can also split out the luminance and colour parts. Most image and video compression puts more emphasis on the luminance profile, leaving the colour more approximate. The subpixels on a digital camera (or a display for that matter) take advantage of this quirk of human vision.
  • Happens to all square shapes.

    A chessboard is 8 tiles wide and 8 tiles long, so it consists of 64 tiles covering an area of, well, 64 tiles.

    • Not all pixels are square, though! Does anyone remember anamorphic DVDs? https://en.wikipedia.org/wiki/Anamorphic_widescreen
      • Never mind anamorphic DVDs, all of them use non-square pixels. The resolution of DVD is 720×480 pixels (or squared pixels, referring back to the article); this is a 3:2 ratio of pixel quantities on the horizontal vs. vertical axes. But the overall aspect ratio of the image is displayed as either 4:3 (SDTV) or 16:9 (HDTV), neither of which matches 3:2. Hence the pixel aspect ratio is definitely not 1:1.
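
        The arithmetic as a small sketch, using the naive ratio of display aspect to storage aspect (ignoring overscan and the slightly different ITU figures):

            from fractions import Fraction

            def pixel_aspect_ratio(width_px, height_px, display_aspect):
                # PAR = display aspect ratio / storage aspect ratio.
                return Fraction(display_aspect) / Fraction(width_px, height_px)

            # NTSC DVD frame: 720x480 stored pixels.
            print(pixel_aspect_ratio(720, 480, Fraction(4, 3)))   # 8/9: pixels narrower than tall
            print(pixel_aspect_ratio(720, 480, Fraction(16, 9)))  # 32/27: pixels wider than tall
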
    • City blocks, too.
      • In the US...
        • Do people in Spanish cities with strong grids (eg Barcelona) not also use the local language equivalent of "blocks" as a term? I would be surprised if not. It's a fundamentally convenient term in any area that has a repeated grid.

          The fact that some cities don't have repeated grids and hence don't use the term is not really a valuable corrective to the post you are replying to.

          • In Slavic languages we think in terms of intersections for distance; maybe it's the same for Spanish? Area is thought of either as inside a district (say, the city center) or in meters squared.
            • A block is just the distance from one intersection to the next. Even if those distances vary or are non-square.

              E.g. Manhattan has mostly rectangular blocks, if you go from 8th Avenue to Madison Avenue along 39th St you traveled 4 blocks (the last of which is shorter than the first 3), if you go from 36th St to 40th St along 8th Avenue you traveled 4 blocks (all of which are shorter than the blocks between the avenues).

        • While it is certainly more common in the US we occasionally use blocks as a measurement here in Sweden too. Blocks are just smaller and less regular here.
  • A pixel is two dimensional, by definition. It is a unit of area. Even in the signal processing "sampling" definition of a pixel, it still has an areal density and is therefore still two-dimensional.

    The problem in this article is it incorrectly assumes a pixel to be a length and then makes nonsensical statements. The correct way to interpret "1920 pixels wide" is "the same width as 1920 pixels arranged in a 1920 by 1 row".

    In the same way that "square feet" means "feet^2" as "square" acts as a square operator on "feet", in "pixels wide" the word "wide" acts as a square root operator on the area and means "pixels^(-2)" (which doesn't otherwise have a name).

    • It is neither a unit of length nor area, it is just a count, a pixel - ignoring the CSS pixel - has no inherent length or area. To get from the number of pixels to a length or area, you need the pixel density. 1920 pixel divided by 300 pixel per inch gives you the length of 6.4 inch and it all is dimensionally consistent. The same for 18 mega pixel, with a density of 300 times 300 pixel per square inch you get an image area of 200 square inch. Here pixel per inch times pixel per inch becomes pixel per square inch, not square pixel per square inch.
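
      The same bookkeeping in code form, reproducing the numbers above:

          width_px = 1920
          density_ppi = 300                  # pixels per inch
          print(width_px / density_ppi)      # 6.4 (inches)

          pixel_count = 18_000_000           # 18 megapixels
          density_ppsi = 300 * 300           # pixels per square inch
          print(pixel_count / density_ppsi)  # 200.0 (square inches)
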
    • CSS got it right by making pixels a relative unit. Meters are absolute. You cannot express pixels in meters. Because they are relative units.

      If you have a high resolution screen, a CSS pixel is typically 4 actual display pixels (2x2) instead of just 1. And if you change the zoom level, the number of display pixels might actually change in fractional ways. The unit only makes sense in relation to what's around it. If you render vector graphics or fonts, pixels are used as relative units. On a high resolution screen the renderer will actually use those extra display pixels.

      If you want to show something that's exactly 5 cm on a laptop or phone screen, you need to know the dimensions of the screen and figure out how many pixels you need per cm to scale things correctly. CSS has some absolute units, but they typically only work as expected for print media.

    • > The correct way to interpret "1920 pixels wide" is "the same width as 1920 pixels arranged in a 1920 by 1 row".

      But to be contrarian, the digital camera world always markets how many megapixels a camera has. So in essence, there are situations where pixels are assumed to be an area, rather than a single row of X pixels wide.

      • The digital camera world also advertises the sensor size. So a 24MP APS-C camera has smaller pixels than a 24MP Full-frame camera, for example.
    • Same as if you were building a sidewalk and you wanted to figure out its dimensions, you’d base it off the size of the pavers. Because half pavers are a pain and there are no half pixels.
    • > in "pixels wide" the word "wide" acts as a square root operator on the area and means "pixels^(-2)"

      Did you mean "pixels^(1/2)"? I'm not sure what kind of units pixels^(-2) would be.

      • pixel^(-2) is "per squared pixel". Analogously, 1 pascal = 1 newton / 1 metre^2. (Pressure is force per squared length.)
    • > A pixel is two dimensional, by definition.

      A pixel is a point sample by definition.

      • An odd definition. A pixel is a physical object, a picture element in a display or sensor. The value of a pixel at a given time is a sample, but the sample isn't the pixel.
        • Definitions of technical things you aren’t familiar with tend to be odd.

          You are referring to a physical piece of a display panel. A representation of an image in software is a different thing. Hardware and software transform the DSP signal of an image into voltages to drive the physical pixel. That process takes into account physical characteristics like dimensions.

          Oh btw physical pixels aren’t even square and each RGB channel is a separate size and shape.

  • A bathroom tile is also a unit of length and area. A wall can be so many tiles high by so many wide, and its area the product, also measured in tiles.

    It is just word semantics revolving around a synecdoche.

    When we say that an image is 1920 pixels wide, the precise meaning is that it is 1920 times the width of a pixel. Similarly 1024 pixels high means 1024 times the height of a pixel. The pixel is not a unit of length; its height or width are (and they are different when the aspect ratio is not 1:1!)

    A syntax-abbreviating semantic device in human language where part of something refers to the whole or vice versa is called a synecdoche. Under synecdoche, "pixel" (the whole) can refer to "pixel width" (part or property of the whole).

    Just like the synecdoche "New York beats Chicago 4:2" refers to basketball teams in its proper context, not literally the cities.

  • Hopefully most people get that the exact meaning of "pixel" depends on context?

    It certainly doesn't make sense to mix different meanings in a mathematical sense.

    E.g., when referring to a width in pixels, the unit is pixel widths. We shorten it and just say pixels because it's awkward and redundant to say something like "the screen has a width of 1280 pixel widths", and the meaning is clear to the great majority of readers.

  • So, the author answers the question:

    > That means the pixel is a dimensionless unit that is just another name for 1, kind of like how the radian is length divided by length so it also equals one, and the steradian is area divided by area which also equals one.

    But then for some reason decides to ignore it. I don’t understand this article. Yes, pixels are dimensionless units used for counting, not measuring. Their shape and internal structure is irrelevant (even subpixel rendering doesn’t actually deal with fractions - it alters neighbors to produce the effect).

  • This kind of thing is common in english, though. "an aircraft carrier is 3.5 football fields long"

    The critical distinction is the inclusion of a length dimension in the measurement: "1920 pixels wide", "3 mount everests tall", "3.5 football fields long", etc.

  •     But it does highlight that the common terminology is imperfect and breaks the regularity that scientists come to expect when working with physical units in calculations
    
    Scientists and engineers don't actually expect much; they make a lot of mistakes, are not very rigorous, and are not demanding towards each other. It is common for units to be wrong, context-defined, socially dependent, and even sometimes added together when the operator + hasn't been properly defined.
  • A pixel is neither a unit of length nor area, it is like a byte, a unit of information.

    Sometimes, it is used as a length or area, omitting a conversion constant, but we do that all the time; the article gives mass vs force as an example.

    Also worth mentioning that pixels are not always square. For example, the once popular 320x200 resolution has pixels taller than they are wide.

    • It's not a unit of information. How many bytes does a 320×240 image take? You don't know until you specify the pixel bit depth in bpp (bits per pixel).
  • This article wastes readers' time by pretending to a command of its subject in a manner that is authoritative only in its uncertainty.

    Pixel is an abbreviation for 'picture element' which describes a unit of electronic image representation. To understand it, consider picture elements in the following context...

    (Insert X different ways of thinking about pictures and their elements.)

    If there is a need for a jargon of mathematical "dimensionality" for any of these ways of thinking, please discuss it in such context.

    Next up:

    <i>A musical note is a unit of...</i>

  • This depends upon who you ask. CSS defines a pixel as an angle:

    https://www.w3.org/TR/css-values-3/#reference-pixel

    • That's the definition of a "reference pixel", not a pixel. They actually refer to a pixel (and the angle) in the definition.
  • For those who programmed 8-bit computers or worked with analog video, a pixel is also a unit of time. An image is a long line with some interruptions.
  • Also a measurement of life. Back in the 320x200 game days, when playing something with a health bar, we used to joke that someone had one pixel of life left when near death.
  • The pixel ain't no problem.

    A "megapixel" is simply defined as 1024 pixels squared ish.

    There is no kilopixel. Or exapixel.

    No-one doesn't understand this?

  • Reminds me of this Numberphile w/ Cliff Stoll [1]: The Nescafé Equation (43 coffee beans)

    [1] https://youtu.be/3V84Bi-mzQM

  • Or perhaps it's multivariate and there's no point in trying to squish all the nuance into a single solid definition.
  • A Pixel is a telephone.
  • This is a fun post by Nayuki - I'd never given this much thought, but this takes the premise and runs with it
  • Should be pixel as area and pixel-length as 1-dimensional unit.

    So an image could be 1 mega pixel, or 1000 times 1000 pixel-lengths.

  • Pixel is also a unit of a phone.
  • I’m surprised the author didn’t dig into the fact that not all pixels are square. Or that pixels are made of underlying RGB light emitters. And that those RGB emitters are often very non-square. And often not 1:1 RGBEmitter-to-Pixel (stupid pentile).
    • Or the fact that a 1 megapixel camera (counting each color-filtered sensing element as a pixel) generates less information than a 1 megapixel monitor (counting each RGB triad as a pixel) can display.
    • > "Je n’ai fait celle-ci plus longue que parce que je n’ai pas eu le loisir de la faire plus courte."

      or

      > "I have made this longer than usual because I have not had time to make it shorter."

  • Pixel is just contextual. When you are talking about one-dimensional things it's a unit of length. In all other cases it's a unit of area.
  • Pixels are not measurement units. They're samples of an image taken a certain distance apart. It's like eggs in a carton: it's perfectly legitimate to say that a carton is 6 eggs long and 3 eggs wide, and holds a total of 18 eggs, because eggs are counted, they're not a length measure except in the crudest sense.
  • A pixel is a sample or a collection of values of the Red, Green, and Blue components of light captured at a particular location in a typically rectangular area. Pixels have no physical dimensions. A camera sensor has no pixels, it has photosites (four colour sensitive elements per one rectangular area).
    • And what’s the difference between a photosite and a pixel? Sounds like a difference made up to correct other people.
      • A photosite is a set of four photosensitive electronic sensors that register levels of the RGB components of light: https://www.cambridgeincolour.com/tutorials/camera-sensors.h... The camera sensor turns data captured by a single photosite into a single data structure (a pixel), a tuple of as many discrete values as there are components in a given colour space (three for RGB).
        • If you want to be pedantic, you shouldn’t say that the photosite has 4 sensors, depending on the color filter array you can have other numbers like 9 or 36, too.

          And the difference is pure pedantry, because each photosite corresponds to a pixel in the image (unless we’re talking about lens correction?). It’s like making up a new word for monitor pixels because those are little lights (for OLED) while the pixel is just a tuple of numbers. I don’t see how calling the sensor grid items „pixels“ could be misunderstood.

          • You are right about the differences in the number of sensors; there may be more. I prefer to talk about photosites, because additional properties like photosite size or sensor photosite density help me make better decisions when I'm selecting cameras/sensors for a photo project. For example, a 24MP M43 sensor is not the same as a 24MP APS-C or FF sensor, even though the image files they produce have the same number of pixels. Similarly, a 36MP FF sensor is essentially the same as a 24MP APS-C sensor: it produces image files that contain more pixels from a wider field of view, but the resolution of the sensor stays the same, because both sensors have the same photosite density (if you pair the same lens with both sensors).
        • I didn't think a single photosite was directly converted to a single pixel, there's quite a number of different demosaicing algorithms.

          Edit: Upon doing some more reading, it sounds like a photosite, or sensel, isn't a group of sensors but a single sensor, which picks up one of R, G, B - "each individual photosite, remember, records only one colour – red, green or blue" - https://www.canon-europe.com/pro/infobank/image-sensors-expl...

          I couldn't seem to find a particular name for the RGGB/.. pattern that a Bayer filter consists of an array of.
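
          A minimal sketch of the several-photosites-per-pixel idea (a toy RGGB mosaic and a deliberately naive 3x3 interpolation, nothing like a production demosaicer):

              import numpy as np

              BAYER = np.array([["R", "G"],
                                ["G", "B"]])  # RGGB tile, repeated across the sensor

              rng = np.random.default_rng(0)
              mosaic = rng.random((4, 4))  # raw data: one value per photosite

              def demosaic_channel(mosaic, channel):
                  # For each output pixel, average every photosite of the requested
                  # color in its 3x3 neighbourhood, so each full-color pixel draws on
                  # several photosites rather than just "its own".
                  h, w = mosaic.shape
                  mask = np.fromfunction(lambda y, x: BAYER[y % 2, x % 2] == channel,
                                         (h, w), dtype=int)
                  out = np.zeros_like(mosaic)
                  for y in range(h):
                      for x in range(w):
                          win = np.s_[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
                          out[y, x] = mosaic[win][mask[win]].mean()
                  return out

              rgb = np.dstack([demosaic_channel(mosaic, c) for c in "RGB"])
              print(rgb.shape)  # (4, 4, 3): one full-color pixel per photosite location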

    • Is a pixel not a pixel when it's in a different color space? (HSV, XYZ etc?)
      • RGB is the most common colour space, but yes, other colour spaces are available.
  • The author is very confused.
  • Wait till they hear about fluid ounces.