The dynamic range is the reason Tesla now counts photons rather than using traditional camera processing. They basically remove the concept of exposure entirely and simply pass the sensor photon counts to the neural net.
This approach is not only simpler, since it removes photo processing/encoding, but it also means the NN can operate with a very high dynamic range similar to the human eye, and in many cases can be sensitive at the single-photon level.
> They basically remove the concept of exposure entirely and simply pass the sensor photon counts to the neural net.
That sentence does not make sense. There's no such thing as a count without a corresponding interval over which the count occurred. That interval is the exposure.
You can of course do lots of (very) short exposures to avoid sensor saturation. That's "just" a movie at a very high frame rate. And then you can post-process this in lots of exciting ways, align the frames, average them, etc, etc.
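Something like this toy sketch (all numbers made up, alignment omitted) is all "summing short exposures" amounts to:

```python
import numpy as np

# Toy numpy sketch of "a movie at a very high frame rate":
# many short exposures, none of which saturates individually,
# summed into a wide accumulator. Alignment is omitted and every
# number is illustrative.

def stack_short_exposures(frames):
    """Sum short-exposure frames into one high-dynamic-range image."""
    acc = None
    n = 0
    for f in frames:
        acc = f.astype(np.uint64) if acc is None else acc + f
        n += 1
    return acc.astype(np.float64) / n  # average (or keep the raw sum)

# A scene with a 10^5:1 brightness ratio that would clip any single
# 10-bit exposure, captured as 1000 Poisson-noisy short frames.
rng = np.random.default_rng(0)
rates = np.array([[1.0, 1e5]])  # relative photon rates per pixel
frames = (rng.poisson(rates * 0.001).astype(np.uint16) for _ in range(1000))
print(stack_short_exposures(frames))
```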
Yeah, that's fair. A CCD sensor basically converts individual photons to electrical charges. What Tesla has said they've done is throw away all the traditional image signal processing and post-processing, which often includes a lot of exposure-related averaging.
You're right, though, that we don't typically use real-time neural networks that operate on spike rates, so an interval needs to be chosen for photon counting, which could be considered a kind of exposure, and it is critical that the interval be short enough to avoid saturation.
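As a back-of-envelope for how short that interval has to be (numbers purely illustrative, not anything Tesla has published):

```python
# Back-of-envelope for "short enough to avoid saturation".
# Every number here is an illustrative assumption, not a Tesla spec.

full_well = 10_000          # electrons a small pixel can hold before clipping
bright_flux = 1e9           # photoelectrons/sec for a pixel staring at,
                            # say, the sun's reflection off chrome

t_max = full_well / bright_flux            # longest safe counting interval
print(f"~{t_max * 1e6:.0f} microseconds, i.e. ~{1 / t_max:,.0f} counts/sec")
# ~10 microseconds, i.e. ~100,000 counts/sec
```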
Lol, this doesn't make any sense. The dynamic range of a fully sunlit California highway at noon in the summer (i.e. the brightness of the darkest vs. the brightest spot) is far beyond what any existing sensor can capture in a single exposure. You cannot ignore exposure; you have to choose which part of the scene you want within the brightness range your camera sensor can capture. You will have areas of the scene that clip, in other words areas that come out pure black or pure white with no data.
You can do bracketed exposures, but that's literally the opposite of ignoring exposure.
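To make it concrete, here's roughly what a bracketed merge looks like; note that the exposure times are explicit inputs (the weights and levels here are illustrative):

```python
import numpy as np

# Minimal bracketed-exposure merge. Each pixel's radiance is
# estimated from whichever brackets didn't clip, divided by the
# exposure time -- the times are explicit inputs, which is exactly
# why bracketing is the opposite of ignoring exposure.

def merge_brackets(frames, exposure_times, white_level=4095):
    num = np.zeros_like(frames[0], dtype=np.float64)
    den = np.zeros_like(num)
    for f, t in zip(frames, exposure_times):
        f = f.astype(np.float64)
        # "hat" weighting: trust mid-range pixels, ignore clipped ones
        w = np.clip(1.0 - np.abs(2.0 * f / white_level - 1.0), 0.0, 1.0)
        num += w * (f / t)      # per-bracket radiance estimate
        den += w
    return num / np.maximum(den, 1e-9)
```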
Just keep the duration low enough that you never saturate the sensor, even in bright sunlight, and let the NN do the summations.
At a fundamental level it is somewhat akin to bracketing, except all that HDR processing/frame matching is performed within the NN rather than in a traditional image processing stack.
The NN is better suited to this anyway, since it must already be performing camera/pose motion tracking to correlate what it's seeing from frame to frame.
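As a rough illustration, here's a classical stand-in for that align-then-accumulate step, with hypothetical per-frame homographies standing in for the pose estimates the NN would track:

```python
import cv2
import numpy as np

# Classical stand-in for align-then-accumulate (the point above is
# that the NN does this internally). The per-frame 3x3 homographies
# derived from the pose/motion estimate are hypothetical inputs here.

def accumulate_aligned(frames, homographies, height, width):
    acc = np.zeros((height, width), dtype=np.float64)
    for f, H in zip(frames, homographies):
        # warp each short exposure into the reference frame, then sum
        acc += cv2.warpPerspective(f.astype(np.float64), H, (width, height))
    return acc
```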
Counting photons won't keep a camera from being "jammed." Unless you are using a physically perfect polarizing filter, such that each pixel on the sensor only receives photons from its exact angular window, traced back through the lenses, you have a camera that can ultimately be "jammed."
The human eye isn't so great on those terms either. But humans can raise a hand to block the sun if it's shining straight into their eyes.
The crash you referenced occurred in 2016, when they were still using radar on the cars; I don't believe they were yet using raw photon counts, nor did the NN have the voxel-based memory it has now.
The big limit of LiDAR is cost, more than anything. There have been dozens of public driving trials where, at a functional level, the answer has been positive (apart from traffic lights, the bastards), but nobody wants to buy a solution with a six-figure BOM before integration.
How do you count photons continuously? What... this makes no sense. If you pass "the photon count", you just did an exposure. Also, how does a photodiode count photons?