• Without having read into this deeper, it sounds like someone could take an original video which has this code embedded as small fluctuations in luminance over time and edit it or produce a new video, simply applying the same luminance changes to the edited areas/generated video, no? It seems for a system like this every pixel would need to be digitally signed by the producer for it to be non-repudiable.
    • Exactly, that is my question too. If you can detect the lighting variations to read and verify the code, then you can also extract them, remove them, reapply to the edited version or the AI version... varying the level of global illumination in a video is like the easiest thing to manipulate.

      Although there's a whole other problem with this, which is that it's not going to survive consumer compression codecs. Because the changes are too small to be easily perceptible, codecs will simply strip them out. The whole point of video compression is to remove perceptually insignificant differences.

      • As I understand it, the brilliant idea is that the small variations in brightness of the pixels look just like standard noise. Distinguishing the actual noise from the algorithm's injected code is not possible, but it is still possible to verify that the 'noise' has the correct pattern.
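
        A rough sketch of that idea as I understand it (a generic spread-spectrum-style detector in Python with numpy; my illustration, not necessarily the paper's exact scheme): the code is a keyed pseudorandom sequence added at a tiny amplitude, indistinguishable from sensor noise unless you correlate against the key.

          import numpy as np

          def keyed_code(key: int, n_frames: int):
              """Pseudorandom +/-1 sequence derived from a secret key."""
              rng = np.random.default_rng(key)
              return rng.choice([-1.0, 1.0], size=n_frames)

          def embed(brightness, code, amplitude=0.5):
              """Add the code as tiny per-frame brightness fluctuations."""
              return brightness + amplitude * code

          def detect(brightness, code):
              """Correlate the per-frame brightness signal with the keyed code; a high score means the code is present."""
              centered = brightness - brightness.mean()
              return float(np.dot(centered, code) / len(code))

          frames = 128.0 + np.random.normal(0.0, 2.0, 600)   # noisy per-frame mean brightness
          code = keyed_code(key=42, n_frames=600)
          print(detect(embed(frames, code), code))           # ~0.5: code present
          print(detect(frames, code))                        # ~0.0: code absent
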
        • Correct pattern for the correct time span matching random fluctuations in the electrical grid.
          • I think that will be handled by the AC to DC conversion in most systems.
    • Academics presenting the opening move in a game of white hat / black hat, thinking the game is over after one turn.
    • The code embedded into the luminosity is sampled from a distribution resembling the noise already present in the video.

      Plus, the code gives information about the frame it's embedded into, so you still have more work to do.

      • Doesn't this just fall apart if a video is reencoded? Something fairly common on all video platforms.
    • Not if you encode a cryptographic signature in the watermark
      • what would that change
        • The general idea is for the signature to be random each time, but verifiable. There are a bajillion approaches to this, but a simple starting point is to generate a random nonce, sign it with your private key, then publish the nonce and signature along with the public key. Only you know the private key, so only you could have produced a signature that verifies against that nonce with the public key. Also, critically, every signature is different (that's what the nonce is for). If two videos appear to have the same signature, even if that signature is valid, one of them must be a replay and is therefore almost certainly fake.

          (Practical systems often include a generational index or a timestamp, which further helps to detect replay attacks.)

          I think for the approach discussed in the paper, bandwidth is the key limiting factor, especially as video compression mangles the result, and ordinary news reporters edit the footage for pacing reasons. You want short clips to still be verifiable, so you can ask questions like "where is the rest of this footage" or "why is this played out of order" rather than just going, "there isn't enough signature left, I must assume this is entirely fake."
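
          Going back to the nonce idea, here's a toy sketch (Python, using an Ed25519 keypair from the 'cryptography' package; any signature scheme would do, and the function names are made up):

            import os, time
            from cryptography.exceptions import InvalidSignature
            from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

            key = Ed25519PrivateKey.generate()
            pub = key.public_key()
            seen = set()                  # nonces we've already accepted

            def make_token():
                """Fresh random nonce + timestamp, signed; only the private key holder can produce this."""
                payload = os.urandom(16) + int(time.time()).to_bytes(8, "big")
                return payload, key.sign(payload)

            def check_token(payload, sig):
                """Valid only if the signature verifies AND the nonce was never seen before."""
                try:
                    pub.verify(sig, payload)
                except InvalidSignature:
                    return False
                nonce = payload[:16]
                if nonce in seen:         # same signature showing up twice => a replay
                    return False
                seen.add(nonce)
                return True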

          • But the point is that you'd be extracting the nonce from someone else's existing video of the same event.

            If a celebrity says something and person A films a true video, and person B films a video and then manipulates it, you'd be able to see that B's light code is different. But if B simply takes A's lighting data and applies it to their own video, now you can't tell which is real.

            • I am not defending the proposed method, but your criticism is not why:

              Let's assume the pixels have an 8-bit luminance depth, and let's say the 7 most significant bits are kept while the signature is coded into the last bit of the pixels in a frame. A hash of the full 7-bit image frame could be cryptographically signed. While you could copy the 8th bit plane to a fake video, the same signature will not check out according to a verifying media player, since the fake video's leading 7 bit planes won't hash to the value that was signed.

              What does this change compared to the status quo? Nothing: you can already hash and sign a full 8-bit video, and Serious-Oath that it depicts Real imagery. Your signature would also not be transplantable to someone else's video, so others can't put fake video in your mouth.

              The only difference: if the signature is generated by the image sensor, and end-users are unable to extract the private key, then it decreases the number of people / entities able to credibly fake a video, but it gives manufacturers great power to sign fake videos while the masses cannot (unless they play a fake video on a high-quality screen being imaged by a manufacturer-private-key-containing image sensor).
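
              A minimal sketch of that bit-plane idea (Python, assuming numpy and the 'cryptography' package; my illustration, not the paper's method):

                import numpy as np
                from cryptography.exceptions import InvalidSignature
                from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

                def sign_frame(frame, key):
                    """Sign the 7 most significant bit planes; the LSB plane would carry the signature."""
                    return key.sign((frame & 0xFE).tobytes())

                def verify_frame(frame, sig, pub):
                    """Fails if the leading 7 bit planes were altered, even with the LSB plane copied over."""
                    try:
                        pub.verify(sig, (frame & 0xFE).tobytes())
                        return True
                    except InvalidSignature:
                        return False

                key = Ed25519PrivateKey.generate()
                frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)   # 8-bit luminance frame
                sig = sign_frame(frame, key)
                assert verify_frame(frame, sig, key.public_key())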

  • Even if everything they say is true, that wouldn't prove a video is fake; at best it proves a video is real. If people will accept "our high-profile defendant in the segregated housing unit of a maximum security prison hung himself with a makeshift noose fashioned from blankets off a bedpost that isn't even as tall as he is while the guards were playing 3d space cadet pinball and the camera was broken and his cellmate was in solitary", surely they will accept "our maintenance guy used regular lightbulbs from Home Depot instead of the super secure digital signature bulbs".

    Or maybe "we installed the right bulbs but then we set the cameras to record in 240p MPEG with 1/5 keyframe per second because nobody in the office understands how digital video works".

    Anyways I'm of the opinion that the ultimate end-state of deep fakes will be some sort of hybrid system where the AI creates 3d models and animates a scene for a traditional raytracing engine. It lets the AI do what it's best at (faces, voices, movement) and eliminates most of the random inconsistencies. If that happens then faking these light patterns won't be difficult at all.

    • I will argue one point. People think guards sleeping all shift is part of the conspiracy. This is the reality of the majority of jails and even law enforcement. I’d be more surprised if they were awake not scamming. It’s very common. (Experience in the profession)
      • I agree, but the thing about the Epstein scandal is that just about any individual aspect of it could plausibly be a coincidence when examined in isolation. It's only when you look at the entire scandal at once that it becomes obvious something is seriously wrong due to the enormous piles of "coincidences" this man leaves in his wake everywhere he goes.
  • I have a vague memory of some old HN discussion about how known fluctuations in light, caused by slightly varying electricity frequency, have already been used to detect fake video, and that databases with information about frequencies by location and time exist for this purpose?
  • Reminds me of the 'mains hum' technique that was used to identify videos. Can also be done with light.

    https://www.youtube.com/watch?v=e0elNU0iOMY

    https://en.wikipedia.org/wiki/Electrical_network_frequency_a...

  • I'm finding that AI seems incapable of generating aperiodic monotile designs. I suspect this is because the shape is nowhere in any training data, and the tiling never repeats, so without patterns to train on it produces obvious errors. It invents geometry that stands out like a sore thumb. I think it has potential to serve as protection against deepfakes. I made an online store around all this, but I haven't really advertised it because I'd like a little more confirmation before I run with it. Would love some feedback on the idea.
  • While it does not seem enough to guarantee authenticity, this scheme does seem like it would prevent creating a video from scratch pretending to be taken at a protected location without having express knowledge of the key or the flickering at that moment in time.

    Definitely interesting for critical events and locations, but quite niche.

    • My question would be, who does have "express knowledge of the key or the flickering at that moment in time" and are they trustworthy?
  • > “Each watermark carries a low-fidelity time-stamped version of the unmanipulated video under slightly different lighting. We call these code videos,”

    If this is the only info that's encoded, then that might not be an entirely bad idea.

    (Usually, the stego-ing of info can help identify, say, a dissident who made a video that was critical of a regime. There are already other ways, but defeating them is whack-a-mole, if universities are going to keep inventing more.)

    > Each watermarked light source has a secret code that can be used to check for the corresponding watermark in the video and reveal any malicious editing.

    If I have the dissident video, and a really big computer, can I identify the particular watermarked light sources that were present (and from there, know the location or owner)?

    • This is apparently how they located bin Laden in Pakistan, using drones and watermarked sounds.
  • It's rare that I think an academic paper from a good school that is trending on HN is actively stupid, but this is that paper.

    If you're even considering going to all the trouble of setting up these weird lights and specialized algorithms for some event you're hosting, just shoot your own video of the event and post it. Done.

    "Viewers" aren't forensic experts. They aren't going to engage with this algorithm or do some complex exercise to verify the private key of the algorithm prior to running some app on the video, they are just going to watch it.

    Opponents aren't going to have difficulty relighting. Relighting is a thing Hollywood does routinely, and it's only getting easier.

    Posting your own key and own video does nothing to prove the veracity of your own video. You could still have shot anything you want, with whatever edits you want, and applied the lighting in software after the fact.

    I'm sure it was fun to play with the lights in the lab, but this isn't solving a problem of significance well.

    • I think you might have misunderstood some core use cases.

      One significant problem currently is long-form discussions being taken wildly out of context for the sake of propaganda, cancelling, or otherwise damaging the reputation of those involved. The point isn't that a given video is unedited in its original form, but that the original source video can be compared against a later cut (whether the original was itself edited is neither here nor there).

      I'm not saying this solution is the answer, but trying to prove that videos are unedited from their original release is a pretty reasonable goal.

      I also don't follow where the idea that viewers need to be forensic experts comes from. My understanding is that a video can be verified as authentic, at least in the sense the original author intended. I didn't read that users would be responsible for this, but rather that it can be done when required.

      This is particularly useful in cases like the one I highlighted above, where a video may be re-cut to make an argument the person (or people) in question never made, and which might be used to smear said persons (a common occurrence in the world of long-form podcasting, for example).

      • It would be interesting to know if you could write software to take a video with these flashes in it, post-process them out, morph the video to be taken from another angle, add in a different signature. Then claim the first video is fake and that the 2nd video is the true unedited version.

        Total Relighting SIGGRAPH Talk: https://www.youtube.com/watch?v=qHUi_q0wkq4

        Physically Controllable Relighting of Photographs: https://www.youtube.com/watch?v=XFJCT3D8t0M

        Changing the view point post process: https://www.youtube.com/watch?v=7WrG5-xH1_k

      • It would be pretty cool to live in that world, where a maliciously edited video can be met with a better verified, full version of it.

        I don’t think that’s where we are, right? People are happy to stop looking after they see the video that confirms their negative suspicions about the public figure on the other team, and just assume any negative clips from their own team are taken out of context.

      • While I don't know whether the paper is "stupid" or not, I think nobody in the last two decades has ever seen an uncut interview. So I don't see how this light would help or prove anything.
        • I think it is a current propaganda or messaging strategy: you say “In the uncut recording of the interview, I made really good points, but they spliced it up to make me look stupid,” or “In the uncut version of the interview, my opponent said a bunch of nonsense, but they cut it out.” This works because the broadcaster isn’t going to play the uncut version, and even if they did, nobody would bother watching it.
    • Even in a world where the common folk all accepted that such watermarking was a real phenomenon, they wouldn't ever verify it themselves. Even if they wanted to verify it themselves, there would need to be a chain of trust to actually verify what the watermark should be. And in the circles where fake videos circulate, that chain of trust will be distrusted, too.
    • This can be used for automated detection and flagging.

      I’m under the impression this isn’t for end users; it’s for enforcement within the context of intellectual property.

      I’m curious to see what the value proposition is as it’s unclear who would be buying this and why. I suppose platforms might want it to prove they can help or offer services to enforce brand integrity, maybe?

    • The central problem seems to be that the people who stand to benefit from claiming something real is fake are the same ones you have to trust to determine whether it's fake, since the viewer can't determine that (even if they provide a black-box program that supposedly checks this, you can't know what it really does, so the same trust problem exists). Maybe this would be useful for a while inside an organization to be sure employees aren't using editing tools on video.
    • Yes, I think that the state of modern video generation has made an uncomfortable truth more clear - All Evidence is hearsay, only as trustworthy as the people you're getting it from. For a brief shining moment video evidence was easy to produce but hard to forge, but that's not been the case for most of history. That's why the law has so much detail about evaluating the trustworthiness of witnesses.
  • I applaud the idea of marking videos as real, but somehow I don't think it matters. People disagree on facts and reality and ignore contradictions or proof to the contrary. If fact-checking is already used as a slur or dog whistle in some circles, then what can a reality-watermark accomplish?
  • If it’s not directly human-verifiable, people have to rely on 3rd party tools or official/media statements to verify content legitimacy. Such reliance requires trust in authorities and media, which have both been subject to systematic erosion as of late.

    I don’t see the point of this technology. It might be useful for entities like Meta and Google, which could use it to warn of fake content. However, in practice that amounts to giving those entities more power over our perceptions and the realities we build upon them.

  • So, if the technical end of this really works, whoever controls the venue and its records can control what videos (real or fake) are considered “real”, in the rare circumstances where the kind of forensics needed to verify it would be employed.

    Unfortunately, in most of the cases where that would be useful, that's also a party pretty high on the list of “who are we concerned about manipulating video”, so...

  • A more accessible thing that protects against fake videos, at least in the short term, is multiple cameras and a complicated background.

    Maybe eventually we get a model that can take a video and "rotate" it, or generate a 3D scene that can be recorded from multiple angles. But eventually we may get a model that can generate anything. For now, 4o can't maintain obvious consistency with so many details, and I imagine it's orders of magnitude harder to replicate spatial/lighting differences accurately enough to pass expert inspection.

    If you want solid evidence that a video is real, ask for another angle. Meanwhile, anything that needs to be covered with a camera (security or witness) should have at least two.

  • The problem of authenticating footage has a disappointing but sufficient solution:

    1.- Use filming devices that sign the footage with a key, and the device has some anti-tamper protections to prevent somebody from stealing the key.

    2.- The thing above is useless for most consumers of the footage, which would only see it after three to four transcodings change the bytes beyond recognition. But in a few years everybody will assume most footage in the Internet is fake, and in those cases when people (say, law enforcement) want to know for sure if something really happened, they’ll have to go through the trouble of procuring and watching the original, authenticated file.

    The alternatives to 1 and 2 are:

    a) To engage in an arms race, like the one which is happening with captchas right now.

    b) To roll back this type of AI.

    b is not going to happen, even with a societal collapse and sweeping laws against GenAI, because the way GenAI works is widely known. Unless we want to roll back technology itself, stop producing the hardware, and unwind the culture so that people no longer know how to do any of this.

    • The whole "digital cameras with unhackable authenticity signatures" idea can kind of be broken by just pointing a camera at a screen. It sounds a bit ridiculous but screen density and dynamic range is quite high these days, with the right setup you could play a deepfake video and get it to look real AND have a signature saying it's totally authentic
    • Film worked for a long time (and still works) despite the availability of film printers.
  • The entire mainstream media will have no business model if they are not able to edit live video to support their preferred narrative.
  • I'm not sure that raising the bar for how expensive it is to fake a video is the right step to take. That'll just preserve, for a little while longer, the false sense of security that comes from seeing a video and assuming it's real. Meanwhile, the well-funded bad actors will remain uninhibited (and aren't they the scariest ones?)

    It looks like fun research, but I think we'd get a lot better bang for our buck pursuing ways to attach annotations to a video like:

    > I was there and I saw this happen

    ...such that you can find transitive trust paths from yourself to a verifier who annotated the video. That'll require a bit of trust hygiene that the common folk aren't prepared for, but I don't think there's any getting around preparing them for it.

  • I thought we were finally getting away from that "subtly flickering fluorescent lights" vibe in public spaces that gives 30% of the population headaches. But I guess we're bringing it back. Another victory for AI!
  • Nope, it just means the faker has more work to do.

    I don't think there's any possible solution that cannot also be faked in itself.

    • Of course it would, the same way encrypting data works.

      Encrypt some data into the video itself (ideally changing every frame), unique and creatable only by the holder of the private key. Anyone can verify it. Flag reused codes. That's it?

      • I have hitherto not timestamped or cryptographically signed my light sources, but that's something I'll be looking into.
        • Wonder if you could measure your breathing rate and heartbeat and cryptographically sign the time series data as ground truth. Then post process the video with Eulerian Video Magnification to recover the values and compare.

          edit: forgot the link: https://people.csail.mit.edu/mrub/vidmag/

        • Might be interesting if you are a high-value individual. Maybe in the future we will see a Secret Service member shining a light on the POTUS at all times to ensure that no fake video of the President can be circulated. Maybe with a scheme where they publish the keys used after each day, to build trust and make sure anyone can verify the authenticity of any video containing the President.

          Or anyone else who cares enough about deepfakes and can afford the effort

          • I’m not sure I understand. Could someone not take an existing legitimate video, light and all, then manipulate it to e.g. have the president saying something else?
              • If you don't manipulate the visual part, the lip movements won't match up to what's said. If you do manipulate it, the edit now has to respect the super special light. I don't think it'd be impossible, but it'd be far harder than a regular deepfake. And even if you succeed (or someone writes good software that can do it), the White House can still point to the original video to show that the two were presumably taken at the same time, so one of them must be fake.

                I'd agree that it's a lot of effort for very marginal gain.

      • The codes from the OP are just flashes of light in the environment. The attacker could read the codes and overlay them onto another video, without needing to decrypt them. That's just a standard replay attack.

        If you flag a reused code in 2 different videos, how do you tell which video is real?

        • Well, the code wouldn't be representative of the new frame, right?

          For example, you encrypt the hash of the frame itself (+ metadata: frame number, timestamp, etc.) with a private key. My client decrypts the hash, computes the hash of the frame itself, and compares the two.

          The problem might present itself when compressing the video, but the tagging step can be done after compression. That would also prevent resharing.
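
          Roughly something like this (Python, with the 'cryptography' package; sketched with a signature rather than literal encrypt/decrypt; the hash binds the already-compressed frame bytes to its frame number and timestamp, so a tag copied onto a different frame won't verify):

            import hashlib, struct
            from cryptography.exceptions import InvalidSignature
            from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

            key = Ed25519PrivateKey.generate()

            def tag_frame(frame_bytes, frame_no, timestamp):
                """Hash the compressed frame plus metadata and sign the digest."""
                digest = hashlib.sha256(frame_bytes + struct.pack(">QQ", frame_no, timestamp)).digest()
                return key.sign(digest)   # published alongside the frame

            def check_frame(frame_bytes, frame_no, timestamp, tag, pub):
                """Recompute the digest and verify the signature against it."""
                digest = hashlib.sha256(frame_bytes + struct.pack(">QQ", frame_no, timestamp)).digest()
                try:
                    pub.verify(tag, digest)
                    return True
                except InvalidSignature:
                    return False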

        • The light source could be connected to a clock and the flashes represent the encryption of the time using a private key, verifiable using a public key.

          It's a lot of complexity, so probably only worthwhile for high value targets like government press conference rooms, etc.

          • That still doesn't help, because the flashes are independent of the content of the video. To illustrate:

              echo "This comment was posted at 18:21 UTC" | sha256sum
              4f51109e71ec4df85a52affec59a9104837664be3008d1bd70cb8b4fbe163862  -
            
            You could easily copy those flashes of light into your next comment if you wanted, without reversing the hash.
            • From the paper:

              “ rather than encoding a specific message, this watermark encodes an image of the unmanipulated scene as it would appear lit only by the coded illumination”

              They are including scene data, presumably cryptographically signed, in the watermark, which allows for a consistency check that is not easily faked.

              • That's just saying that the coded image will only be apparent in the areas of the image lit by the light. Which is obvious, that's how a flashlight works too. They're not signing the actual pixels or anything. They've increased the difficulty to that of 3D-mapping the scene and transferring the lighting: not trivial, but still two long-studied problem spaces.
            • Hmm yeah fair point. I'm not sure you can do it without some control over the observer device then... will we have "authenticated cameras" soon, with crypto in secure elements? Feels like we'll have to go there to have any trust in video.
              • Not soon, we've had them for a long time. Here's one time one of those systems was hacked... 15 years ago. https://www.elcomsoft.com/news/428.html

                It turns out if you give an adversary physical access to hardware containing a private key, and they are motivated enough to extract it, it's pretty hard to stop them.

        • I suppose the verification algorithm would need to also include a checksum that is basically a hash of the frame’s pixels. So not impossible to fake but also not practical to do so.
      • I don't think encryption is comparable to a simple duplication of data.
    • Kinda like captchas: they really do nothing to prevent someone from continuing to scrape data or do something malicious. They only slow them down or make them spend $2.99 (or less) per 1000 successfully solved captchas.