GitHub CLI Authorization with a Fine-grained Access Token

Fine-grained access tokens are a good fit for shared lab PCs that need access to private GitHub repos: you can restrict their scope to specific repositories and avoid putting your personal SSH keys on shared machines. I am experimenting with the GitHub command line tool gh to authenticate with GitHub using fine-grained access tokens and to make common remote operations on repos easier.

Today I encountered a subtle problem in the gh authentication process. If you set the protocol to ssh during login, then you will not have access to the repos that you granted permissions to in the fine-grained access token. This can lead to a lot of head scratching because it's not at all clear which permissions map to which git operations. In other words, what you think is a specific permissions error with the token is actually an authentication error.

To avoid the problem, be sure to specify https and not ssh as the protocol during authentication:

 echo "$ACCESS_TOKEN" | gh auth login -p https --with-token

Raspberry Pi I2C Quickstart

Below are my notes concerning the control of a Sparkfun MCP4725 12-bit DAC over I2C with a Raspberry Pi.

Raspberry Pi Setup

  1. Enable the I2C interface with raspi-config if it isn't already enabled. Verify that the I2C device file(s) are present in /dev/ with ls /dev | grep i2c. (I had two files: i2c-1 and i2c-2.)
  2. Install the i2c-tools package for debugging I2C interfaces.
sudo apt update && sudo apt install -y i2c-tools

i2cdetect

Attach the DAC to the Raspberry Pi. The pinout is simple:

Raspberry Pi    MCP4725
GND             GND
3.3V            Vcc
SCL             SCL
SDA             SDA

Next, run the command i2cdetect -y 1. This will check for a device on bus 1 (/dev/i2c-1) and automatically accept confirmations:

leb@raspberrypi:~/$ i2cdetect -y 1
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: 60 -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --

Each I2C device must have a unique 7-bit address, i.e. 0x00 to 0x7f. The ranges [0x00, 0x07] and [0x78, 0x7f] are reserved. The above output indicates the DAC is at address 0x60. (Rows are the value of the first hexadecimal number of the address, columns are the second.)

i2cset

i2cset is a command line tool, part of i2c-tools, for writing data to I2C devices. I can set the voltage output of the DAC to 0 as follows:

i2cset -y 1 0x60 0x40 0x00 0x00 i

The arguments mean the following:

  • -y : skip the interactive confirmation prompt
  • 1 : the I2C bus number (/dev/i2c-1)
  • 0x60 : the device address found with i2cdetect
  • 0x40 : the command byte (explained below)
  • 0x00 0x00 : the two data bytes (explained below)
  • i : write the values as an I2C block

Command byte

The command byte is explained on pages 23 and 25 of the MCP4725 datasheet. From most-significant to least-significant bits, the bits mean:

  1. C2 : command bit
  2. C1 : command bit
  3. C0 : command bit
  4. X : unused
  5. X : unused
  6. PD1 : Power down select
  7. PD0 : Power down select
  8. X : unused

According to Table 6-2 and Figure 6-2, C2, C1, C0 = 0, 1, 0 identifies the command to write to the DAC register and NOT also to the EEPROM. In normal operation, the power down bits are 0, 0 (page 28).

So, to write to the DAC register, we want to send 0b01000000 which in hexadecimal is 0x40.

Data bytes to voltage

The data bytes are explained in Figure 6-2 of the datasheet. The first byte contains bits 11-4, and the second byte bits 3-0 in the most-significant bits:

D11 D10 D9 D8 D7 D6 D5 D4 | D3 D2 D1 D0 X X X X

12-bits are used because this is a 12-bit DAC. The mapping between bytes and voltage is:

Data bytes, hex    Data bytes, decimal    Voltage
0x00 0x00          0                      0
0xFF 0xF0          65520                  V_max

where V_max is the voltage supplied to the chip's Vcc pin (3.3V in my case). The output step size is \( \Delta V = V_{max} / 4096 \) or about 0.8 mV.
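For example, to set the output to roughly half of V_max, the 12-bit value 0x7FF maps to the data bytes 0x7F and 0xF0 following the bit layout above:

i2cset -y 1 0x60 0x40 0x7f 0xf0 i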

Control via Python

This is modified from Sparkfun's tutorial and uses the smbus Python bindings. Be aware that the tutorial example has a bug in how it prepares the list of bytes to send to the DAC.

import smbus


OUTPUT_MAX: int = 4095
V_MAX = 3.3


def send(output: float, channel: int = 1, device_address: int = 0x60, command_byte: int = 0x40):
    assert 0.0 <= output <= 1.0, "Output voltage must be expressed as a fraction of the maximum in the range [0.0, 1.0]"

    bus = smbus.SMBus(channel)

    output_bytes = int(output * OUTPUT_MAX) & 0xfff
    data_byte_0: int = (output_bytes & 0xff0) >> 4  # First data byte
    data_bytes: list[int] = [data_byte_0, (output_bytes & 0xf) << 4]  # Second data byte

    bus.write_i2c_block_data(device_address, command_byte, data_bytes)


if __name__ == "__main__":
    output: float = 0.42
    send(output)

    print(f"Estimated output: {output * V_MAX}")

Misc.

Basic Calculator bc

This is a command line calculator and can be used for hexadecimal, binary, and decimal conversions. Install with apt install bc.

# Convert 0x40 to binary
echo "ibase=16; obase=2; 40" | bc

# Convert 0x40 to decimal
echo "ibase=16; 40" | bc

Note that hexadecimal values must be entered in uppercase, e.g. C7, not c7!

Persist Internet Connection Sharing after Reboot

In my previous post I wrote about how to use Microsoft's Internet Connection Sharing to share an internet connection on a Windows machine with a Raspberry Pi. Unfortunately, I learned that the ICS service settings do not persist after the Windows machine reboots, and as a result the ICS connection is lost.

The fix is explained in this Microsoft Learn page.

To fix the issue, add a key in the Windows registry with the following information:

  • Path: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\SharedAccess
  • Type: DWORD
  • Setting: EnableRebootPersistConnection
  • Value: 1
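If you prefer the command line, the same value can be set from an elevated prompt with reg add:

reg add "HKLM\Software\Microsoft\Windows\CurrentVersion\SharedAccess" /v EnableRebootPersistConnection /t REG_DWORD /d 1 /f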

I then had to reset the shared connection by unchecking and rechecking the boxes in the Sharing tab of the internet connection as explained previously. After a reboot, I confirmed that I could connect to the Pi without manually re-enabling ICS.

Internet Connection Sharing for Raspberry Pi Setups

Today I decided to set up an old Raspberry Pi 3B+ for a task in the lab. After burning the latest Raspberry Pi OS Lite image on the SD card, I booted it up and was faced with the unfortunately common problem of network access. It would have taken days to get IT to register the Pi's MAC address on our system, and I did not want to wait that long.

Luckily, I had a spare network crossover cable and an extra ethernet interface on my Windows work laptop, so I plugged the Pi directly into the laptop and enabled Microsoft Internet Connection Sharing (ICS) between the network connection through which I was connected to the internet and the connection to the Pi. In my specific example:

  1. Press the Windows key and navigate to View network connections
  2. Right click on my internet connection (Ethernet 2 in my case), select Properties..., and then the Sharing tab.
  3. Check Allow other network users to connect... and in the Home networking connection: dropdown, select the connection corresponding to the Pi (Ethernet in my case).
  4. Check Allow other network users to control.... I'm not sure whether this is necessary.

Click OK and restart the Pi if it's already connected. Once it restarts, it should now have internet access through the laptop.

Next I wanted to connect with SSH to the Pi from my laptop, and for that I needed to know the Pi's IP address. Luckily, ICS uses the mshome.net domain name for the network, and the Raspberry Pi by default has the hostname raspberrypi. So getting the IP is as easy as running the ping raspberrypi.mshome.net command in PowerShell.
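From there, connecting is a matter of something like the following (the username depends on what was configured when flashing the OS image):

ssh pi@raspberrypi.mshome.net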

The Mono16 Format and Flir Cameras

For a long time I found the Mono16 image format of Flir's cameras a bit strange. In the lab I have several Flir cameras with 12-bit ADCs, but the images they output in Mono16 span a range from 0 to around 65535. How does the camera map a 12-bit number to a 16-bit number?

If you search for the Mono16 format you will find that it's a padded format. This means that, in the 12-bit ADC example, 4 bits in each pixel are always 0, and the remaining 12 bits represent the pixel's value. But that should mean we only get pixel values between 0 and 2^12 - 1, or 4095. So how is it that we can saturate one of these cameras with values near 65535?

Today it occurred to me that Flir's Mono16 format might not use all the values in the range [0, 65535]. This is indeed the case, as I show below with an image stack that I acquired from one of these cameras:

>>> sorted_unique_pixels = np.unique(images.ravel())
>>> np.unique(np.diff(sorted_unique_pixels))
array([ 16,  32,  48,  64,  96, 144], dtype=uint16)

This prints all the possible, unique differences between the sorted and flattened pixel values in my particular image stack. Notice how they are all multiples of 16?

Let's look also at the sorted array of unique values itself:

>>> sorted_unique_pixels
array([ 5808,  5824,  5856, ..., 57312, 57328, 57472], dtype=uint16)

There are more than a million pixels in the image stack, yet they all take values that are integer multiples of 16.

It looks like Flir's Mono16 format rescales the camera's output onto the interval [0, 65535] by introducing "gaps" of size 2^(16 - N) between allowed values, where N is the bit-depth of the camera's ADC.

But wait just a moment. Above I said that 4 bits in the Mono16 are zero, but I assumed that these were the most significant bits. If the least significant bits are the zero padding, then the allowed pixel values would be, for example, 0000 0000 = 0, 0001 0000 = 16, 0010 0000 = 32, 0011 0000 = 48, etc. (Here I ignored the first 8 bits for clarity.)

So it appears that Flir is indeed padding the 12-bit ADC data with 0's in its Mono16 format. But, somewhat counter-intuitively, it is the four least significant bits that are the zero padding. I say this is counter-intuitive because I have another camera that pads the most significant bits, so that the maximum pixel value is really 2^N - 1, with N being the ADC's bit-depth.
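A practical consequence: if you want the raw ADC counts back from one of these LSB-padded Mono16 images, a right shift by four should undo the padding (a sketch, assuming the layout described above):

adc_counts = images >> 4  # remove the 4 zero-padded least significant bits; max value becomes 4095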

Literature Review: An Optical Technique for Remote Focusing in Microscopy

Citation

E.J. Botcherby, R. Juškaitis, M.J. Booth, T. Wilson, "An optical technique for remote focusing in microscopy," Optics Communications, Volume 281, Issue 4, 2008, Pages 880-887

Abstract

We describe the theory of a new method of optical refocusing that is particularly relevant for confocal and multiphoton microscopy systems. This method avoids the spherical aberration that is common to other optical refocusing systems. We show that aberration-free refocusing can be achieved over an axial scan range of 70 μm for a 1.4 NA objective lens. As refocusing is implemented remotely from the specimen, this method enables high axial scan speeds without mechanical interference between the objective lens and the specimen.

Reasons for this Review

I am interested in this paper for two reasons:

  1. Recent advances in light sheet microscopy have made the theory of remote focusing more relevant than in the past.
  2. The paper presents a simplified theory of imaging by a high numerical aperture (NA) objective that is useful for understanding image formation in microscopes without resorting to the usual (and more complicated) Richards and Wolf description.

Problem Addressed by the Paper

The introduction lays out the reasons for this paper in a straightforward manner:

  • The primary bottleneck in 3D microscopy is axial scanning of the sample (what the authors call refocusing).
  • Due to fundamental optics, refocusing a high resolution microscope involves varying the objective/sample distance, i.e. the image plane must remain fixed.
  • It would be desirable to develop a simple mechanism whereby the objective or sample need not move to achieve refocusing in such microscopes without introducing unwanted aberrations.
    • This is because samples are becoming more complex (think embryos, organoids, etc.).
    • Adaptive optics to fix these aberrations would introduce too much complexity into the setup. (More on this later.)

Theory of 3D Imaging in Microscopes

The theory of 3D imaging is introduced by first considering a perfect imaging system with an object space refractive index of \( n_1 \) and an image space refractive index of \( n_2 \). Such a system transforms all the rays emanating from any point in the 3D object space to converge to a single point in the 3D image space. An image formed by such a system is known as a stigmatic image. Unfortunately, Maxwell, followed by Born and Wolf, showed that such a system is only possible if the magnification is the same in all directions and with magnitude

$$ \left| M \right| = \frac{n_1}{n_2} $$

This also implies that conjugate rays must have the same angle with respect to the optical axis.

$$ \gamma_2 = \pm \gamma_1 $$

Any system that does not meet these criteria is not a perfect imaging system. However, there are conditions under which a system can still form a perfect image, but only for objects of limited extent in the object space. The two conditions that are relevant for microscopy are

  1. the sine condition, and
  2. the Herschel condition.

Under the sine condition, points in a plane transverse to the optical axis are imaged perfectly onto the image plane; points that lie at some axial distance from the object plane suffer from spherical aberration and their images are not stigmatic. In some sense, the Herschel condition is the opposite: on-axis points are imaged stigmatically regardless of their axial position, but off-axis points suffer from aberrations.

The authors note the important fact that most microscope objectives are designed to satisfy the sine condition. As a result, the image plane must remain fixed so that aberration-free refocusing can only be achieved by varying the sample-objective distance. In the authors' words:

...it is possible to see why commercial microscopes, operating under the sine condition refocus by changing the distance between the specimen and objective, as any attempt to detect images away from the optimal image plane will lead to a degradation by spherical aberration.

Questions

  1. Does an ideal imaging system need only produce stigmatic images, or must it also accurately reproduce the relative positions between any pair of points in the image space (up to a proportionality factor)?
  2. What exactly are the definitions of the sine and Herschel conditions? Is it the equations relating the angles of conjugate rays? Is it based on the subset of the object space that is imaged stigmatically? Or, as we'll see in the next section, are they defined by the mapping of ray heights between principal surfaces? The authors present a few attributes of each condition, but I'm not certain which attributes serve as the definitions and which are consequences of their assumptions being true.

The General Pupil Function

I really liked this section. The authors present a model of a high NA microscope objective that is based on its principal surfaces. They then use a mix of scalar wave theory and ray tracing to explain why the sine condition produces stigmatic images for points near the axis in the focal plane of the objective. I think the value in this model is that it is much more approachable than the electromagnetic Richards and Wolf model for aplanatic systems.

To recall, the principal planes in paraxial optics are used to abstract away the details of a lens system. Refraction effectively occurs at these planes, and the focal length is measured relative to them. In non-paraxial systems, the principal planes actually become curved surfaces. Interestingly, most of the famous optics texts, such as Born and Wolf, are somewhat quiet about this fact, but it can be found in papers such as Mansuripur, Optics and Photonics News, 9, 56-60 (1998).

So a high NA objective is modeled as a pair of principal surfaces:

  1. The first is a sphere centered on the axis with a radius of curvature equal to the focal distance
  2. The second is a plane perpendicular to the axis, and they refer to it as the pupil plane

Another important thing to note is that these surfaces are not the usual reference spheres centered about object and image points and located in the entrance/exit pupils. I think the authors are right to use principal surfaces because many modern objectives are object-space telecentric, which places the entrance pupil at infinity. In this case the concept of a reference sphere sitting in the entrance pupil becomes a bit murky and I do not know whether it's applicable.

In any case, the authors compute the path length differences between points in the object space in this system and use the sine and Herschel conditions to map the rays from the object to the image space principal surfaces. (Each condition results in a different mapping.) Under the approximation that the extent of the object is small, the equations for the path length differences demonstrate what was stated in the previous section: that the sine condition leads to spherical aberration for points that do not lie in the focal plane of the objective. In fact, the phase profile of the wave (the authors weave between ray and wave optics) exiting the second principal plane is expanded as:

$$ znk \left[ 1 - \frac{\rho^2 \sin^2 \alpha}{2} + \frac{\rho^4 \sin^4 \alpha}{8} + \cdots \right] $$

For \( z = 0 \), i.e. the object is in the focal plane, all the terms disappear and we get a flat exit wave. When \( z \neq 0 \):

Focussing the tube lens is accurately described by the quadratic term, as it operates in the paraxial regime. Unfortunately the higher order terms which represent spherical aberrations cannot be focussed by the tube lens and consequently there is a breakdown of stigmatic imaging for these points.

In other words, under the sine condition, object points that are outside the focal plane produce curved, non-spherical wavefronts that cannot be focussed to a single point by a tube lens.

If, however, another lens in a reversed orientation was placed so that the curved wavefront from the objective was input into it, it would form a stigmatic image in its image space. This suggests a method for remote focussing.

Questions

  1. Is the second principal surface flat because the image is formed at infinity by a high NA, infinity-corrected objective? What would its radius of curvature be in a finite conjugate objective?
  2. Is the authors' pupil plane coplanar with the objective's exit pupil? Probably not; I think they're referring to the plane in which we find the objective's pupil function, which is somewhat standard (and confusing) nomenclature in microscopy.

A Technique for Remote Focusing

We arrive now at the crux of the paper. The authors suggest a setup for remote focusing that is free (within limits) of the spherical aberration that is introduced by objectives that satisfy the sine condition. Effectively they image the pupil from one objective onto the other with a 4f system. This ensures that the aberrated wavefront from the first objective is "unaberrated" by the second objective. Then, another microscope images the focal region of the second objective. 3D scanning is achieved by moving the objective of the second microscope (often called O3 in light sheet microscopes).

There are a few important points:

  • A 4f system needs to be used between the first (O1) and second (O2) objectives to relay the pupil because it faithfully maps the wavefront without adding any additional phase distortion.
  • On a related note, you can't use tube lenses in the 4f system that are not afocal with the objective. These so-called widefield tube lenses do not share a focal plane with the objective. The objective's pupil must be in the front focal plane of the 4f system.
  • The "perfect" imaging system of O1/4f system/O2 will have an isotropic magnification of \( n1 / n2 \). This satisfies Maxwell's requirement for 3D stigmatic imaging.
  • This approach will not work well for objectives that require specific tube lenses for aberration correction. (Sorry Zeiss.)
  • You will not lose resolution as long as the second objective has a higher angular aperture (not numerical aperture). You can, for example, use a NA 1.4 oil objective for O1 and a NA 0.95 dry objective for O2 because the O2 object space is in air, whereas the O1 object space is in oil with \( n \approx 1.5 \). From the definition of numerical aperture, the sine of the limiting angle of O1 must necessarily be smaller than that of the air objective.

At this point I found it amusing that the authors cited "complexity" as a reason for why their approach is superior to adaptive optics in the introduction of this paper.

Questions

  1. The authors suggest a different approach where a mirror is placed after O2 so that it also serves as O3 and use a beam splitter to direct the light leaving O2 onto a camera. Why don't light sheet microscopes use this setup? Is it because of a loss of photons due to the beam splitter?

Range of Operation

The equation for the path length difference between points in object space depends on the assumption of small object distances. This assumption places a limit on the range of validity of this approach. To quantify this limit, the authors computed the Strehl ratio of the phase of the wavefront in the pupil. Honestly, the calculations of this section look tedious. In the end, and after "some routine but rather protracted calculations, a simple result emerges." The simple result looks kind of ugly, depending, among other things, on the sine of the aperture angle raised to the eighth power. It looks like the approach is valid for distances of several tens of microns on both sides of the focal plane of O1, which is in fact quite useful for many biological samples.

Ironically, the authors decide at this point that adaptive optics, the approach to remote focusing that is too complex, probably isn't that bad after all. It can be used to extend the range of validity of the authors' approach by correcting the higher order terms that are dropped in the binomial expansion for the optical path difference.

Summary

The authors go on to experimentally verify the approach in a rather unremarkable experiment of taking z-stacks of beads in two different setups. The PSF in their approach is much less aberrated than a normal widefield microscope over an axial range of about \( \pm 40 \mu m \).

Overall I quite like the paper because of its simplified theoretical model and clear explanation of the sine condition. I would argue, though, that the approach is not necessarily less complex than some of the alternatives that they rule out in the introduction. Admittedly, arguments over complexity are usually subjective and this doesn't necessarily mean the paper is of low quality. Given that many light sheet approaches are now based on this method, the paper serves as a good theoretical grounding in why remote focusing works and, in some cases, may be necessary.

Automated Testing of Simulation Code via Hypothesis Testing

Missing a Theory of Testing for Scientific Code

If you search the Internet for resources on the theory of testing code, you will find information about the different types of tests and how to write them. You will also find that it is generally accepted among programmers that good code is tested and bad code is not. The problem for scientists and engineers, however, is that the theory concerning the testing of computer code was developed primarily by programmers that work on systems that model business processes. There is little theory on how, for example, to test the outcome of physics simulations. To further exacerbate the problem, scientific programmers feel obliged to write tests without the guidance of such a theory because of the imperative to test their code. This leads to convoluted tests that are difficult to understand and maintain.

Scientific Code is Different

Code that models business processes is based on explicit rules that are developed from a set of requirements. An example of a rule that a business system might follow is "If a customer has ordered an item and has not paid, then send her an invoice."

To test the above rule, we write out all the possible cases and write a test for each one. For example:

  1. A customer orders an item without paying. Expected result: an invoice is sent.
  2. A customer orders an item and pays at the time of checkout: Expected result: no invoice is sent.

I have found that a good way to identify test cases in business logic is to look for if/else statements in a rule. Each branch of the statement should be a different test.

Now let's consider a physics simulation. I am an optical engineer, so I will use an example from optics. One thing I have often done in my work is to simulate the image formation process of a lens system, including the noise imparted by the camera. A simple model of a CMOS camera pixel is one that takes an input signal in photons, adds shot noise, converts it to photoelectrons, adds dark noise, and then converts the electron signal into analog-to-digital units. Schematically:

photons --> electrons --> ADUs

A simplified Python code snippet that models this process, including noise, is below. An instance of the camera class has a method called snap that takes an input array of photons and converts it to ADUs.

from dataclasses import dataclass

import numpy as np


@dataclass
class Camera:
    baseline: int = 100  # ADU
    bit_depth: int = 12
    dark_noise: float = 6.83  # e-
    gain: float = 0.12  # ADU / e-
    quantum_efficiency: float = 0.76
    well_capacity: int = 32406  # e-
    rng: np.random.Generator = np.random.default_rng()

    def snap(self, signal):
        # Simulate shot noise and convert to electrons
        photoelectrons = self.rng.poisson(
            self.quantum_efficiency * signal, size=signal.shape
        )

        # Add dark noise
        electrons = (
            self.rng.normal(scale=self.dark_noise, size=photoelectrons.shape)
            + photoelectrons
        )

        # Clip to the well capacity to model electron saturation
        electrons = np.clip(electrons, 0, self.well_capacity)

        # Convert to ADU
        adu = electrons * self.gain + self.baseline

        # Clip to the bit depth to model ADU saturation
        adu = np.clip(adu, 0, 2 ** self.bit_depth - 1)

        return adu.astype(np.uint16)

How can we test this code? In this case, there are no if/else statements to help us identify test cases. Some possible solutions are:

  1. An expert can review it. But what if we don't have an expert? Or, if you are an expert, how do we know that we haven't made a mistake? I have worked professionally as both an optical and a software engineer and I can tell you that I make coding mistakes many times a day. And what if the simulation is thousands of lines of code? This solution, though useful, cannot be sufficient for testing.

  2. Compute what the results ought to be for a given set of inputs. Rules like "If the baseline is 100, and the bit depth is 12, etc., then the output is 542 ADU" are not that useful here because the output is random.

  3. Evaluate the code and manually check that it produces the desired results. This is similar to expert review. The problem with this approach is that you would need to recheck the code every time a change is made. One of the advantages of testing business logic is that the tests can be automated. It would be advantageous to preserve automation in testing scientific code.

  4. We could always fix the value of the seed for the random number generator to at least make the test deterministic, but then we would not know whether the variation in the simulation output is what we would expect from run-to-run. I'm also unsure whether the same seed produces the same results across different hardware architectures. Since the simulation is non-deterministic at its core, it would be nice to include this attribute within the test case.

Automated Testing of Simulation Results via Hypothesis Testing

The solution that I have found to the above-listed problems is derived from ideas that I learned in a class on quality control that I took in college. In short, we run the simulation a number of times and compute one or more statistics from the results. The statistics are compared to their theoretical values in a hypothesis test, and, if the result is outside of a given tolerance, the test fails. If the probability of failure is made small enough, then a failure of the test practically indicates an error in the simulation code rather than a random failure due to the stochastic output.

Theoretical Values for Test Statistics

In the example of a CMOS camera, both the theoretical mean and the variance of a pixel are known. The EMVA 1288 Linear Model states that

$$ \mu_y = K \left( \eta \mu_p + \mu_d \right) + B $$

where \( \mu_y \) is the mean ADU count, \( K \) is the gain, \( \eta \) is the quantum efficiency, \( \mu_p \) is the mean photon count, \( \mu_d \) is the mean dark noise, and \( B \) is the baseline value, i.e. the average ADU count under no illumination. Likewise, the variance of the pixel describes the noise:

$$ \sigma_y = \sqrt{K^2 \sigma_d^2 + \sigma_q^2 + K \left( \mu_y - B \right)} $$

where \( \sigma_y \) is the standard deviation of the ADU counts, \( \sigma_d^2 \) is the dark noise variance, and \( \sigma_q^2 = 1 / 12 \, \text{ADU}^2 \) is the quantization noise variance, i.e. the noise from converting an analog voltage into discrete ADU values.
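As a sanity check, plugging the Camera defaults from the snippet above into these formulas, with a mean photon count of 1000 (the input signal used in the next section) and zero-mean dark noise, reproduces the theoretical values quoted below:

import math

K = 0.12              # gain, ADU / e-
eta = 0.76            # quantum efficiency
sigma_d = 6.83        # dark noise, e-
B = 100               # baseline, ADU
mu_p, mu_d = 1000, 0  # mean photon count; the dark noise in the model above is zero-mean
sigma_q_sq = 1 / 12   # quantization noise variance, ADU^2

mu_y = K * (eta * mu_p + mu_d) + B
sigma_y = math.sqrt(K**2 * sigma_d**2 + sigma_q_sq + K * (mu_y - B))

print(mu_y, sigma_y)  # about 191.2 ADU and 3.42 ADU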

Hypothesis Testing

We can formulate a hypothesis test for each test statistic. The test for each is:

  1. Null hypothesis : the simulation statistics and the theoretical values are the same
  2. Alternative hypothesis : the simulation statistics and the theoretical values are different

Let's first focus on the mean pixel values. To perform this hypothesis test, I ran the simulation code a number of times. For convenience, I chose an input signal of 1000 photons. Here's the resulting histogram:

The mean of this distribution is 190.721 ADU and the standard deviation is 3.437 ADU. The theoretical values are 191.2 ADU and 3.420 ADU, respectively. Importantly, if I re-run the simulation, then I get a different histogram because the simulation's output is random.

The above histogram is called the sampling distribution of the mean, and its width is proportional to the standard error of the mean. (Edit 2024/05/30 Actually, I think I am wrong here. This is not the sampling distribution of the mean. To get it we would need to repeat the above experiment a number of times and compute the mean each time, much like I do in the following section. The set of all means from doing so would be its sampling distribution. Fortunately, the estimate of the confidence intervals in what follows should still hold because the sampling distribution of the mean tends to a normal distribution for large \(N \), and this allows for the expression in the equation that follows.)
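For reference, the numbers above came from something like the following (a sketch; I am assuming 10,000 pixel values, each with a mean input of 1000 photons):

photons = 1000 * np.ones((100, 100))  # 10,000 pixels at 1000 photons each
img = Camera().snap(photons)
print(img.mean(), img.std())  # roughly 191 ADU and 3.4 ADU, varying from run to run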

Hypothesis Testing of the Mean Pixel Value

To perform the hypothesis test on the mean, I build a confidence interval around the simulated value using the following formula:

$$ \mu_y \pm X \frac{s}{\sqrt{N}} $$

Here \( s \) is my estimated standard deviation (3.437 ADU in the example above), and \( N = 10,000 \) is the number of simulated values. Their ratio \( \frac{s}{\sqrt{N}} \) is an estimate of the standard error of the mean. \( X \) is a proportionality factor that is essentially a tolerance on how close the simulated value must be to the theoretical one to be considered "equal". A larger tolerance means that it is less likely that the hypothesis test will fail, but I am less certain that the value of the simulation is exactly equal to the theoretical value.

If this looks familiar, it should. In introductory statistics classes, this approach is called Student's one sample t-test. In the t-test, the value for \( X \) is denoted as \( t \) and depends on the desired confidence level and on the number of data points in the sample. (Strictly speaking, it's the number of data points minus 1.)

As far as I can tell there's no rule for selecting a value of \( X \); rather, it's a free parameter. I often choose 3. Why? Well, if the sampling distribution is approximately normally distributed, and the number of sample points is large, then the theoretical mean should lie within 3 standard errors of the simulated one approximately 99.7% of the time if the algorithm is correct. Alternatively, this means that a correct simulation will produce a result that is more than three standard errors from the theoretical mean about every 1 out of 370 test runs.
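The 1-in-370 figure is just the two-sided tail area of a normal distribution beyond three standard deviations, which can be checked, for example, with SciPy:

from scipy.stats import norm

p_outside = 2 * norm.sf(3)       # area beyond +/- 3 standard errors
print(p_outside, 1 / p_outside)  # about 0.0027, or roughly 1 in 370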

Hypothesis Testing of the Noise

Recall that the standard deviation of the pixel values is a measure of the noise. The approach to testing it remains the same as before. We write the confidence interval as

$$ \sigma_y \pm X \left( s.e. \right) $$

where we have \( s.e. \) as the standard error of the standard deviation. If the simulated standard deviation is outside this interval, then we reject the null hypothesis and fail the test.

Now, how do we calculate the standard error of the standard deviation? Unlike with the mean value, we have only one value for the standard deviation of the pixel values. Furthermore, there doesn't seem to be a simple formula for the standard error of the variance or standard error of the standard deviation. (I looked around the Math and Statistics Stack Exchanges, but what I did find produced standard errors that were way too large.)

Faced with this problem, I have two options:

  1. run the simulation a number of times to get a distribution of standard deviations
  2. draw pixel values from the existing simulation data with replacement to estimate the sampling distribution. This approach is known as bootstrapping.

In this situation, both are valid approaches because the simulation runs quite quickly. However, if the simulation is slow, bootstrapping might be desirable because resampling the simulated data is relatively fast.

I provide below a function that makes a bootstrap estimate of the standard error of the standard deviation of the pixel values to give you an idea of how this works. It draws n resamples of the simulated pixel values (with replacement) and places them in the rows of an array. Then, the standard deviation of each row is computed. Finally, since the standard error is the standard deviation of the sampling distribution, the standard deviation of the resampled standard deviations is computed and returned.

def se_std(data, n = 1000) -> float:
    samples = np.random.choice(data.ravel(), (n, data.size), replace=True)
    std_sampling_distribution = samples.std(axis=1)

    return np.std(std_sampling_distribution)

Of course, the value of n in the function above is arbitrary. From what I can tell, setting n to be the size of the data is somewhat standard practice.

Automated Hypothesis Testing

At this point, we can calculate the probability that the mean and standard deviation of the simulated pixel values will lie farther than some distance from their theoretical values. This means that we know roughly how often a test will fail due to pure luck.

To put these into an automated test function, we need only translate the two hypotheses into an assertion. The null hypothesis should correspond to the argument of the assertion being true; the alternative hypothesis corresponds to a false argument.

TOL = 3

def test_cmos_camera(camera):
    num_pixels = 32, 32
    mean_photons = 1000  # the expected values below assume a 1000 photon input signal
    photons = mean_photons * np.ones(num_pixels)
    expected_mean = 191.2
    expected_std = 3.42

    img = camera.snap(photons)

    tol_mean = TOL * img.std() / np.sqrt(num_pixels[0] * num_pixels[1])
    tol_std = TOL * se_std(img)

    assert np.isclose(img.mean(), expected_mean, atol=tol_mean)
    assert np.isclose(img.std(), expected_std, atol=tol_std)

With a TOL value of 3 and with the sampling distributions being more-or-less normally distributed, each assertion should fail about 1 / 370 times because the area in the tails of the distribution beyond three standard errors is 1 / 370. We can put this test into our test suite and continuous integration (CI) system and run it automatically using whatever tools we wish, e.g. GitHub Actions and pytest.

Discussion

Non-deterministic Tests

It is an often-stated rule of thumb that automated tests should never fail randomly because it makes failures difficult to diagnose and makes you likely to ignore the tests. Here however it is in the very nature of this test that it will fail randomly from time to time. What are we to do?

An easy solution would be to isolate these sorts of tests and run them separately from the deterministic ones so that we know exactly where the error occurred. Then, if there is a failure of the non-deterministic tests, the CI could just run them again. If TOL is set so that a test failure is very rare, then any failure of these tests twice would practically indicate a failure of the algorithm to produce the theoretical results.
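With pytest, one way to make this separation is to tag the stochastic tests with a custom marker (the marker name here is my own) and deselect them from the main run:

import pytest

@pytest.mark.stochastic  # custom marker; register it under "markers" in pytest.ini
def test_cmos_camera(camera):
    ...

The deterministic suite then runs with pytest -m "not stochastic", and the stochastic tests run in a separate job that can simply be retried on a rare failure.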

Testing Absolute Tolerances

It could be argued that what I presented here is a lot of work just to make an assertion that a simulation result is close to a known value. In other words, it's just a fancy way to test for absolute tolerances, and possibly is more complex than it needs to be. I can't say that I entirely disagree with this.

As an alternative, consider the following: if we run the simulation a few times we can get a sense of the variation in its output, and we can use these values to roughly set a tolerance that states by how much the simulated and theoretical results should differ. This is arguably faster than constructing the confidence intervals like we did above.

The value in the hypothesis testing approach is that you can know the probability of failure to a high degree of accuracy. Whether or not this is important probably depends on what you want to do, but it does provide you with a deeper understanding of the behavior of the simulation that might help debug difficult problems.

Testing for Other Types of Errors

There are certainly other problems in testing simulation code that are not covered here. The above approach won't tell you directly if you have entered an equation incorrectly. It also requires theoretical values for the summary statistics of the simulation's output. If you have a theory for these already, you might argue that a simulation would be superfluous.

If it's easy to implement automated tests for your simulation that are based on hypothesis testing, and if you expect the code to change often, then having a few of these sorts of tests will at least provide you a degree of confidence that everything is working as you expect as you make changes. And that is one of the goals of having automated tests: fearless refactoring.

Testing the Frequency of Failures

I stated often that with hypothesis testing we know how often the test should fail, but we never actually verified that. We could have run the simulation a large number of times and checked that the number of failures was approximately equal to the theoretical number of failures.

To my mind, it seems that this is just the exact same problem that was addressed above, but instead of testing summary statistics on the output values we test the number of failures. And since the number of failures will vary randomly, we would need a sampling distribution for this. So really this approach requires more CPU clock cycles to do the same thing because we need to run the simulation a large number of times.

Summary

  • Automated testing of simulation code is different than testing business logic due to its stochastic nature and inability to be reduced to "rules"
  • We can formulate hypothesis tests to determine how often the simulation produces values that are farther than a given distance from what theory predicts
  • The hypothesis tests can be translated into test cases: accepting the null hypothesis means the test passes, whereas rejecting the null hypothesis means the test fails
  • Non-deterministic testing is useful when it is quick to implement and you expect to change the code often

French Vocabulary for Machinists

I work in a French-speaking country and frequently need to communicate with our machinists, many of whom do not speak English.

Here is a list of English-French vocabulary words that I have found useful. I will update it as I learn more words.

Last update: 2024-04-18

Materials and processing

  • aluminum, n : aluminium (m)
  • anodized, adj : anodisé
  • stainless steel, n : 1. acier inoxydable, 2. inox (the cool way to say it)

Measurements

  • dimensions, npl : les dimensions (fpl)

Screws, bolts, fasteners, etc.

  • latch, n : un loquet (possibly Swiss-French)
    • push latch, n : un loquet poussoir
  • screw, n : une vis
    • cap (or head) screw, n : une vis à tête
    • countersunk cap screw, n : vis à tête fraisée
  • spring, n : un ressort
  • threading, n : un filetage
    • exterior threads : 1. filetage extérieur, 2. filetages mâles
    • interior threads : 1. filetage intérieur, 2. filetages femelles

Tools

  • die, n : une filière
  • lathe, n : un tour
  • mill, milling machine, n : une fraiseuse
  • tap, n : un taraud

Engineering Fits

I have been working on some optomechanical parts that require a hole-and-shaft style mating. During their design, I realized I really didn't have any theoretical background on how big the holes and shafts should be so that they fit together. This led me to do some basic research into engineering fits.

Engineering Fits

According to Building Scientific Apparatus, 4th ed.1, fit should be specified when the absolute size of two mating parts is not important, but the clearance between them is critical.

To understand fits, it helps first to think in terms of active surfaces and tolerances.

An active surface is a region where two surfaces touch and either move against each other or have a static fit 2. (Interestingly, an active surface is really two physical surfaces by this definition.) The tolerances on the size of two mating parts determine the type of fit. An example of the tolerances on a hole-and-shaft assembly is shown below.

Tolerance Ranges

Fit definitions

In this context, we can define three types of fits:

  1. Clearance fits : Tolerance zones do not overlap
  2. Transition fits : Tolerance zones partially overlap
  3. Interference fits : Tolerance zones fully overlap

These fits exist on a continuum and are not neatly distinguished in practice. The continuum can be seen by plotting the force required for mating vs. the allowance. The allowance in this context can be defined as follows3:

\[ \text{allowance} = \text{smallest hole} - \text{largest shaft} \]
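For example, with made-up dimensions, a hole of 10.00 to 10.02 mm diameter mating with a shaft of 9.97 to 9.99 mm diameter gives

\[ \text{allowance} = 10.00\ \text{mm} - 9.99\ \text{mm} = +0.01\ \text{mm}, \]

a positive allowance, so the tolerance zones do not overlap and this is a clearance fit.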

Clearance fits

  1. Sliding fit : Some lateral play
  2. Running fit : More friction, but more accurate motion

Transition fits

  1. Keyring fit : Slight force required for mating and easy to remove
  2. Push fit : More force required; possible to remove by hand

Interference fits

  1. Force fit : Hand tools likely required for mating
  2. Press fit : Requires more force, likely using a press

A Simple Object-Space Telecentric System

Object-space telecentricity

I have been working on a software package recently for optical systems design. The process of building the package has proceeded like this:

  1. Think of a particular case that I want to model; for example an infinite conjugate afocal system
  2. Implement it in the code
  3. Discover that the code doesn't work
  4. Create a test case that helps debug the code
  5. Repeat

I am modeling a telecentric lens in the current iteration of this loop. To keep things simple, I am limiting myself to an object-space telecentric system. This was more challenging than I expected. In part, the reason is that I was trying to infer whether a system was or was not telecentric from the lens prescription data and a ray trace, which has two problems:

  1. I need to do a floating point comparison between two numbers to say whether a system is telecentric. Either the chief ray angle in object-space has to be zero or the entrance pupil must be located at infinity. Floating point comparisons are notoriously difficult to get right, and if you're doing them then you might want to rethink what you're trying to model.
  2. Numerous checks are needed before we can even trace any rays. For example, I should check first whether the user placed the object at infinity. This would form the image in the same plane as the aperture stop, which does not really make sense.

I find it interesting that Zemax addresses these problems by introducing object-space telecentricity as an extra boolean flag that forces the chief ray angle to be zero in the object-space. In other words, the user needs to know what they're doing and to specify that they want telecentricity from the beginning.

An object-space telecentric example

I adapted the following example from lens data presented in this video: https://www.youtube.com/watch?v=JfstTsuNAz0. Notably, the object distance was increased by nearly a factor of two from what was given in the video so that the image plane was at a finite distance from the lens. Paraxial ray trace results were computed by hand.

A simple object-space telecentric system comprising a planoconvex lens and a stop.
Surface          0          1             2            3           4
Comment          OBJ                                   STOP        IMG
\( R \)                     \( \infty \)  -9.750
\( t \)          29.4702    2             15.97699     17.323380
\( n \)          1          1.610248      1            1
\( C \)                     0             -0.10256
\( -\Phi \)                 0             -0.06259
\( t/n \)        29.4702    1.24204       15.97699     17.323380
\( y \)          0          29.4702       30.712240    15.97699    0
\( nu \)         1          1             -0.922279    -0.922279
\( \bar{y} \)    1          1             1            0           -1.084270
\( n \bar{u} \)  0          0             -0.06259     -0.06259

This system is shown below with lens semi-diameters of 5 mm. Note that the stop is at the paraxial focus of the lens. The rays in the sketch cross the axis before the stop because of spherical aberration.

Remarks

Marginal ray trace

At first the marginal ray trace was a bit confusing because the entrance pupil is at infinity. How can the marginal ray, which intersects the pupil at its edge, be traced when the pupil is at infinity? Then I remembered that I don't aim for the edge of the pupil when tracing the marginal ray. Instead, I launch a ray from the axis in the object plane at a random angle taking the surface with the smallest ray height as the aperture stop. (I chose a paraxial angle of 1 in the table above. Technically, this is called a pseudo-marginal ray. The real marginal ray is calculated from it by rescaling the surface intersection heights by the aperture stop semi-diameter.) Once you have the marginal ray in image space, just find its intersection with the axis to determine the image location.
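The table's marginal ray rows can be reproduced with a short ynu trace (a sketch using the reduced thicknesses and surface powers from the table above; the variable names are mine):

# A minimal ynu (reduced-angle) trace of the pseudo-marginal ray through the
# prescription above.
reduced_thicknesses = [29.4702, 1.24204, 15.97699, 17.323380]  # t / n for each gap
surface_powers = [0.0, 0.06259, 0.0, 0.0]  # surfaces 1, 2, the stop, and the image plane

y, nu = 0.0, 1.0  # start at the axial object point with a reduced angle of 1
for gap, power in zip(reduced_thicknesses, surface_powers):
    y = y + nu * gap     # transfer to the next surface
    nu = nu - y * power  # refract: nu' = nu - y * Phi

print(y, nu)  # y is ~0 at the image plane, matching the last column of the table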

Telecentric lens design

So how would an object-space telecentric design be implemented in software? First, I'd set an option that would force the chief ray angle to 0 in the object space. Then, I'd simply place a solve on the aperture stop that puts it at the location where the chief ray intersects the axis.