Here's some additional research on how I believe the ATFLIR system works.
I find it easier to understand by comparing it to something more familiar, at least to me, so I'm comparing it to my Nikon P900 superzoom camera. It has 83x zoom, 35mm-equivalent focal lengths of 24-2000mm (the longest currently available) and 16 megapixels. It doesn't quite reach To The Stars, but it does reach the moon:
The P900 has a FOV of around 1º at full zoom, so very close to the 0.7º of the ATFLIR. So P900 videos that hit the optical zoom limit should give a pretty good idea of what sort of reach the ATFLIR has. Basically I can identify you from miles away with it, and you won't even see I'm there looking at you. Note though that some videos use additional digital zoom (which maxes out at an insane 8000mm). The moon, for example, is somewhat smaller than the full screen at the optical limit; anything beyond that is digital zoom. But since it has a 4608x3456 sensor, compared to the 640x480 on the ATFLIR, it can actually provide much more detail from the same distance in good conditions, and even with the digital zoom maxed out, the resulting images still use more pixels.
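To put rough numbers on that comparison, here's a small sketch. It assumes the 2000mm figure is measured against a 36mm-wide frame, that the 8000mm digital zoom is a plain 4x centre crop, and that the ATFLIR's 0.7º spans the 640-pixel side of its sensor. The function names are just mine.

```python
import math

def horizontal_fov_deg(focal_length_mm, frame_width_mm=36.0):
    """Horizontal FOV of a rectilinear lens: 2 * atan(width / (2 * f))."""
    return math.degrees(2 * math.atan(frame_width_mm / (2 * focal_length_mm)))

def pixels_after_digital_zoom(pixels_across, zoom_factor):
    """Digital zoom crops the centre, so fewer real sensor pixels are used."""
    return pixels_across / zoom_factor

# P900: 35mm-equivalent focal length and sensor width in pixels (from the post)
p900_fov = horizontal_fov_deg(2000)
p900_px_optical = 4608
p900_px_digital = pixels_after_digital_zoom(4608, 8000 / 2000)

# ATFLIR: 0.7 degrees assumed to span the 640-pixel side
atflir_fov = 0.7
atflir_px = 640

print(f"P900 horizontal FOV at 2000mm eq: {p900_fov:.2f} deg")
print(f"P900 real pixels across at max digital zoom: {p900_px_digital:.0f}")
print(f"P900 pixels per degree (optical): {p900_px_optical / p900_fov:.0f}")
print(f"ATFLIR pixels per degree: {atflir_px / atflir_fov:.0f}")
```

With these assumptions the P900 comes out at roughly 1º horizontally, and it still has 1152 real pixels across the frame at maximum digital zoom, against 640 for the ATFLIR.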
So why does a $700 handheld camera beat a 2m long pod that costs a few million and weighs some 200kg? The most significant reason is visible light vs infrared. IR radiation has a much longer wavelength, meaning the sensor needs to have much larger pixels to stay within the diffraction limit:
The fundamental limit to pixel size is determined by diffraction. ... For typical f/2.0 optics at 5μm wavelength, the spot size is 25μm. Because system users prefer some degree of oversampling, the pixel size may be reduced for MWIR applications to dimensions on the order of 12μm.
Infrared Detectors, Second Edition
The ATFLIR operates in that MWIR range, and according to the information I quoted before, the sensor has a centerline spacing (I suppose that's close to the pixel size?) of 25μm. Simply multiplying that by the resolution gives a sensor size of 16x12mm. The P900 sensor measures 6.17mm x 4.55mm in total and its pixels are 1.34μm. Full-frame DSLR cameras have a sensor size of 36x24mm.
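Here's a quick check of those numbers. The book doesn't say which formula it used for the spot size, but the standard Airy disk diameter (2.44 x wavelength x f-number) reproduces its 25μm figure closely. The 0.55μm green-light value for the visible comparison is my own assumption, as is treating the centerline spacing as the pixel pitch.

```python
def airy_disk_diameter_um(wavelength_um, f_number):
    """Airy disk diameter out to the first dark ring: 2.44 * wavelength * N."""
    return 2.44 * wavelength_um * f_number

def sensor_size_mm(pixels_wide, pixels_high, pitch_um):
    """Sensor dimensions, assuming pixel pitch equals the centerline spacing."""
    return pixels_wide * pitch_um / 1000, pixels_high * pitch_um / 1000

mwir_spot = airy_disk_diameter_um(5.0, 2.0)      # f/2.0 at 5 um, as in the quote
visible_spot = airy_disk_diameter_um(0.55, 2.0)  # assumed green light, same f/2.0
atflir_sensor = sensor_size_mm(640, 480, 25)     # 25 um spacing from the post
p900_pitch_um = 6.17 / 4608 * 1000               # sensor width / pixels across

print(f"MWIR spot size: {mwir_spot:.1f} um")
print(f"Visible spot size: {visible_spot:.2f} um")
print(f"ATFLIR sensor: {atflir_sensor[0]:.0f} x {atflir_sensor[1]:.0f} mm")
print(f"P900 pixel pitch: {p900_pitch_um:.2f} um")
```

The diffraction spot is about nine times larger at 5μm than in green light, which is why pixels of 1.34μm would be pointless on an MWIR sensor.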
So those P900 pixel sizes wouldn't work with infrared. Bigger pixels can also gather much more light, so the individual pixels are much more accurate and have less noise, especially in low-light conditions. So why didn't they use a bigger sensor on the P900, or on the ATFLIR for that matter? Here's why: this is an example of a DSLR lens with a 1600mm focal length and 1.5º FOV, so still not quite up to the P900 in terms of reach, but it costs about the same as the ATFLIR and isn't that far from it in size and weight either:
World's Most Expensive Camera Lens
That's the crux of the matter. Making those tiny sensors just a little bit larger (either by increasing the pixel size or by adding more pixels of the same size) greatly increases the size and weight of the optics required for long zoom systems. It's always a compromise, even for something as pricey, big and heavy as the ATFLIR, and that low resolution seems to be close to the sweet spot for such an IR system.
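To illustrate the scaling, here's a sketch using simple rectilinear geometry, where the focal length needed for a given FOV is proportional to the sensor width. It assumes the 0.7º spans the 16mm side of the sensor, and it says nothing about the real ATFLIR design (a reflective telescope with relay optics), so treat the result as an effective focal length only.

```python
import math

def focal_length_mm(sensor_width_mm, fov_deg):
    """Focal length giving `fov_deg` across `sensor_width_mm`: w / (2 * tan(fov / 2))."""
    return sensor_width_mm / (2 * math.tan(math.radians(fov_deg) / 2))

fov = 0.7  # degrees, assumed to span the sensor width

f_current = focal_length_mm(16.0, fov)   # 640 pixels at 25 um
f_doubled = focal_length_mm(32.0, fov)   # hypothetical 1280 pixels at 25 um

print(f"16mm-wide sensor at {fov} deg: {f_current:.0f} mm")
print(f"32mm-wide sensor at {fov} deg: {f_doubled:.0f} mm")
```

Doubling the sensor width doubles the focal length, and to keep the same f-number (and so the same diffraction spot) the aperture diameter has to double as well.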
But if most of that compromise is caused by the IR wavelengths, and the ATFLIR can also image in visible wavelengths, couldn't it use a higher resolution for that, especially in good lighting conditions? I believe it could, and that could explain this quote:
An infrared targeting system his planes used magnified an image in the night by 30 times. In daylight a television camera can magnify it by 60 times.
Several sources explicitly state the FOV figures are for MWIR operation (and that is the main function), and Raytheon mentions it has a "Common optical path", which means the optics are (at least for the most part) common for IR and visible light/electro-optical (EO). But the sensors have to be different in any case, and the indium antimonide (InSb) sensor they are touting seems to be commonly used in thermal imaging, not with visible light. So I believe they haven't actually told us what the sensor is for the visible spectrum. Based on the quote above, and similar information elsewhere, it seems plausible that it might have double the resolution, and hence the possibility of using only the center part of that sensor to achieve a further 2x (optical) zoom with the same pixel count as the IR.
Another possibility would be that it actually has something akin to DSLR teleconverters, so basically some sort of additional lens that can be added to the imaging path at some point. That also seems plausible. While the P900 can freely adjust the zoom to any position by extending the lens barrel and moving lenses with motors, the ATFLIR works with switch-in mirrors, meaning it can't do a continuous zoom but just switches from one fixed zoom level to another. Maybe there's one extra level for visible light.
I haven't found any information on whether either of these is the case, but I think it very well could be. In my opinion, those are better explanations for that 30/60x difference than the IR FOV being 1.5º, as it just wouldn't make sense for the medium FOV to have a digital zoom that would create the same result with worse resolution. So all in all, it seems almost certain the IR NAR FOV is that 0.7º.
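The resolution argument can be sketched like this, assuming the alternative reading in which 1.5º is the optical FOV and a 2x digital zoom (a plain centre crop of the 640-pixel array) is applied on top of it:

```python
def displayed_view(pixels_across, optical_fov_deg, digital_zoom=1.0):
    """Return (displayed FOV in degrees, real sensor pixels across the frame)."""
    return optical_fov_deg / digital_zoom, pixels_across / digital_zoom

# Assumed alternative: 1.5 deg optical FOV with 2x digital zoom
medium_zoomed = displayed_view(640, 1.5, digital_zoom=2.0)
# The 0.7 deg narrow FOV with no digital zoom
narrow_optical = displayed_view(640, 0.7)

print(f"1.5 deg + 2x digital: {medium_zoomed[0]:.2f} deg, {medium_zoomed[1]:.0f} real pixels")
print(f"0.7 deg optical:      {narrow_optical[0]:.2f} deg, {narrow_optical[1]:.0f} real pixels")
```

Both end up showing nearly the same slice of sky, but the digitally zoomed view has only half the real pixels across it, so it would add nothing over the optical narrow FOV.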
And as I mentioned before, those 30/60x zoom figures are probably relative to the "optional navigation FLIR", for which 21x21º would be a typical FOV, and that 30x is the ratio between that and the narrowest optical FOV of 0.7ºx0.7º.
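As a quick check of that arithmetic, treating magnification as the ratio of the reference FOV to the imaged FOV (the 21º navigation FLIR FOV is an assumed typical value, not a confirmed ATFLIR spec):

```python
def magnification(reference_fov_deg, fov_deg):
    """Magnification relative to a reference FOV, as a simple FOV ratio."""
    return reference_fov_deg / fov_deg

NAV_FLIR_FOV = 21.0  # degrees, assumed typical navigation FLIR FOV

ir_mag = magnification(NAV_FLIR_FOV, 0.7)   # narrowest optical IR FOV
tv_fov = NAV_FLIR_FOV / 60                  # FOV that a 60x TV figure would imply

print(f"IR magnification at 0.7 deg: {ir_mag:.0f}x")
print(f"FOV implied by 60x: {tv_fov:.2f} deg")
```

Under that assumption, 60x for the TV camera would correspond to a FOV of about 0.35º, which is half the IR figure, consistent with either the centre-crop or the extender idea.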
Edit: I just found a report made in association with Raytheon that seems to confirm some of the things I just wrote:
Strapped to manned aircraft or aerial drones, these multispectral sensors operate in multiple modes – usually with both day (electro-optical camera) and night (infrared camera) capability – to provide ground forces critical, time-sensitive information about the insurgent hiding around the corner or entering a town by vehicle. Both sensor types are typically equipped with high-magnification optical lenses that may provide zoom capability. They may also have laser rangefinders or designator/rangefinders to help identify targets.
...
Overtime, as sensors have decreased in size and increased in resolution, more and more can be packed into one turret. Wide Area Airborne Surveillance (WAAS) systems, which are now being developed by both the US Air Force and US Army, can have as many as nine sensors packaged on the turret. Sierra Nevada Corporations’ Gorgon Stare payload, for example, houses five EO cameras and four IR cameras.
...
In addition, image resolution improved with the advent of high-definition (HD) TV. Both electro-optical (Charged Coupled Device TV) and infrared (thermal imaging) cameras have benefitted from HD technology, which increases the number of pixels in a sensor’s array to improve image resolution. In particular, focal plane arrays have evolved from a 320 x 240 format to 640 x 480 pixels, and now, HD array formats of 1,920 × 1,080 pixels
...
Thermal sensors in particular have undergone some significant improvements in recent years, both in terms of the materials from which they are made and the process by which they operate. Most sensor turrets incorporate staring focal plane array (FPA) thermal imagers, which often operate in either the mid-wave infrared (MWIR 3-5 μm) or long-wave infrared (LWIR - 8-12 μm) spectral ranges, depending on the mission set.
https://www.pilotopolicial.com.br/Documentos/Artigos/Airborne-Imaging-in-2011.pdf
There are also further examples of other products with different FOV setups for IR and visible, and also separate data for high def color (so ATFLIR can even have a separate sensor for that):
CCD television camera: 2° NFoV; 8° WFoV
Thermal Imaging Sensor: 2.8° NFoV; 10° WFoV
...
Thermal Imager: 30° to 0.45°
Daylight camera: matched to thermal imager field of views
...
Colour high definition: 29º to 0.25º
Colour low light high definition: 55º to 1.5º
Short wave infrared: 28º to 0.25º
Thermal imager: 30º to 0.25º
...
Field of view High Magnification Thermal Imager:
NTSC: 31.7 to 0.43° (in four stages)
PAL (large format): 31.8°, 6.51°, 1.30°, 0.52°
PAL (small format): PAL 19.4°, 3.91°, 0.78°, 0.52°
Color daylight TV with zoom lens: 27.4 to 1.4° FoV with optional ×19 lens with ×2 extender: 30.3 to 0.86°
TV Camera with spotter lens (optional): 0.29° or 0.39°
That last one is particularly interesting, as it apparently has something akin to a 2x teleconverter/extender. Those also confirm that the best ones have considerably smaller FOVs than 0.7°, so the ATFLIR's figure isn't anything exceptional.
Edit: Some further information and confirmation of how it works:
The ATFLIR system consists of 25 different Weapons Replaceable Assemblies (WRAs). The WRAs are listed below and described in the following paragraphs:
• Electro-optical sensor unit (EOSU)
...
• Advanced Navigation Forward Looking Infrared (ANFLIR) sensor
...
Housed within the EOSU are the ATFLIR midwave IR receiver, gimbal-mounted telescope, laser spot tracker, and visible electro-optical (EO) camera. All optical components are mounted on a one-piece, beryllium aluminum optical bench. The bench was designed to eliminate alignment errors when individual optical components are removed for maintenance. The outer structure of the EOSU is designed to withstand the wind loads of mach plus velocities associated with high-speed aircraft. The outer structure includes the windscreen, multispectral, and laser spot tracker windows. The optical bench is suspended in the outer structure on four vibration isolators and gimbals.
...
Advanced Navigation Sensor
The ANFLIR sensor is a self-contained FLIR imaging system that provides IR imagery (Figure 6-9) used by the operator to maneuver and navigate safely at low altitudes and high air speeds. The imagery delivered by the ANFLIR sensor is comparable to flying during daylight operations while operating at night.
...
Infrared Video
This subsystem provides the IR video (Figure 6-12) for the tactical aircrew display. IR, visible, and laser energy enters the telescope and is relayed off the pitch gimbal to beam splitters on the optical bench. The separated IR energy passes through the relay, the derotation mechanism, and then through the imager to the 640 X 480 element array. Nonuniformities in the raw image are corrected by the digital nonuniformity and scene-based digital nonuniformity modules before reticules are added by the video processor (VP). The VP also provides manual and automatic gain, level, and polarity control and then converts the digital video to standard RS-170 analog video for display in the cockpit
...
Electro-Optical Video
This subsystem provides visible imagery for use by the operator. The EO camera is boresighted to the FLIR and laser optical path to ensure accuracy. Visible energy is separated from the laser and IR spectrums by beamsplitters and routed to a charge coupling device (CCD) camera. The CCD camera contains a mechanism to ensure the image being displayed maintains the correct horizon orientation. Video for the CCD camera is digitally corrected before being routed to the VP, where reticules are added and control functions implemented before being converted to RS-170 analog video. The EO output utilizes the same video lines to the cockpit displays as the IR video subsystem.
...
Field of View and Zoom
There are three levels of optical FOV available for the operator using the ATFLIR system. They are the wide, medium, and narrow FOVs. The wide FOV is optically fixed at 1X magnification. The medium and narrow FOVs are optically fixed at 1X with a 2X magnification zoom capability. All three FOVs are implemented in the reflective telescope of the EOSU with switch in mirrors.
...
Advanced Navigation
The ANFLIR subsystem (Figure 6-14) is a separable WRA that mounts inside the pod adaptor unit. The ANFLIR WRA provides the operator with the navigation capabilities that were described earlier. When installed, the WRA receives power and cooling air from the pod adaptor unit. This subsystem uses a dedicated RS-170 connection to provide navigational video.
http://navybmr.com/study material/NAVEDTRA 14030A.pdf
Advanced Navigation Sensor The ANFLIR (Figure 7-22) sensor unit is a single WRA mounted at the forward end of the PAU. The IR targeting sensor presents real-time passive thermal imagery that can be placed on the HUD to provide a 1:1 overlay with the real world view. The imagery provides video for day or night detection, tracking, and designation of land or sea targets while maneuvering and navigating safely at low altitudes and high air speeds. The imagery delivered by the ANFLIR sensor is comparable to flying during daylight operations while operating at night.
http://navybmr.com/study material/NAVEDTRA 14029A.pdf
So the navigation FLIR (ANFLIR) is a separate component, and its imagery overlays the HUD 1:1, so it should have the same FOV as the HUD. Now I only need to find out whether the HUD FOV is 21°...