Paragraph by paragraph analysis and tutorial with Python

Paragraph 1: Dose-response curves in pharmacology

Summary: This paragraph introduces the concept of dose-response curves in pharmacology. When the dosage of a drug is plotted on a logarithmic scale against its effect (usually measured as the percentage of subjects affected), the resulting curve often takes on a symmetrical S-shape, known as a sigmoidal curve. Researchers have traditionally used the cumulative distribution function (CDF) of the normal distribution to model this relationship and estimate drug potency.

Intuition: Think of the dose-response curve as representing the variation in how different individuals react to a drug. Some people are very sensitive and respond to low doses, while others require higher doses. When plotted on a log scale, this variation often approximates a normal distribution.

Python code to visualize a typical dose-response curve using the normal CDF:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def dose_response(x, ec50, hill):
    return norm.cdf(np.log10(x) - np.log10(ec50), scale=1/hill)

doses = np.logspace(-2, 2, 100)
ec50 = 1  # dose producing 50% effect
hill = 2  # steepness of the curve

response = dose_response(doses, ec50, hill)

plt.semilogx(doses, response)
plt.xlabel('Dose')
plt.ylabel('Response')
plt.title('Dose-Response Curve')
plt.grid(True)
plt.show()

Paragraph 2: Introducing the Logistic function

Summary: The author introduces an alternative function for modeling dose-response relationships, known as the logistic function. This function has been used in various fields under different names, such as the "growth function" or "autocatalytic curve." It gained popularity in statistics after Pearl and Reed applied it to population growth models. The author chooses to use the term "logistic function" due to its widespread use following Pearl and Reed's work.

Intuition: The logistic function is a versatile S-shaped curve that can model many phenomena involving growth or transition between two states. Its flexibility and simplicity make it attractive for various applications, including dose-response modeling.

Python code to visualize the logistic function:

import numpy as np
import matplotlib.pyplot as plt

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (x - x0)))

x = np.linspace(-10, 10, 100)
L = 1  # maximum value
k = 1  # steepness
x0 = 0  # midpoint

y = logistic(x, L, k, x0)

plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Logistic Function')
plt.grid(True)
plt.show()

Paragraph 3: Normal distribution

Summary: This paragraph explains the theoretical basis for using the normal distribution in dose-response modeling. The idea is that individual susceptibility to a drug follows a normal distribution. If we assume that the minimal lethal dose measures susceptibility, then the proportion of individuals affected by a given dose is represented by the cumulative distribution function (CDF) of the normal distribution up to that dose.

The normal distribution is widely used to model biological traits, and there is experimental evidence supporting its use for susceptibility distributions. This theoretical and empirical support makes the normal CDF an attractive choice for dose-response modeling.

Mathematically, if $X$ represents susceptibility and follows a normal distribution $N(\mu, \sigma^2)$, then the proportion of individuals affected by dose $x$ is given by:

$$P(X \leq x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt$$

Python code to illustrate this concept:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Generate a normal distribution of susceptibilities
susceptibilities = np.random.normal(loc=5, scale=1, size=10000)

# Calculate the proportion affected at different doses
doses = np.linspace(0, 10, 100)
proportion_affected = [np.mean(susceptibilities <= dose) for dose in doses]

# Plot the results
plt.plot(doses, proportion_affected)
plt.xlabel('Dose')
plt.ylabel('Proportion Affected')
plt.title('Dose-Response Curve (Normal Distribution)')
plt.grid(True)

# Add the theoretical normal CDF
plt.plot(doses, norm.cdf(doses, loc=5, scale=1), 'r--', label='Theoretical CDF')
plt.legend()
plt.show()

Paragraph 4: Logistic vs. Normal Representation

Summary: The author argues that the logistic function has several advantages over the normal CDF for dose-response modeling:

  1. It closely approximates the normal CDF, providing similar results.
  2. It applies to a wide range of physicochemical phenomena, potentially offering a stronger theoretical foundation.
  3. It may be easier to handle statistically.

Given these potential benefits, the author proposes a comparative examination of both functions.

Intuition: The logistic function and normal CDF are both S-shaped curves, but the logistic function may be more flexible and easier to work with in certain contexts. By comparing them directly, we can assess their relative merits for dose-response modeling.

Python code to compare the logistic function and normal CDF:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.special import expit  # logistic function

x = np.linspace(-5, 5, 100)
normal_cdf = norm.cdf(x)
logistic = expit(x)

plt.plot(x, normal_cdf, label='Normal CDF')
plt.plot(x, logistic, label='Logistic')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Comparison of Normal CDF and Logistic Function')
plt.legend()
plt.grid(True)
plt.show()

# Calculate the maximum difference
max_diff = np.max(np.abs(normal_cdf - logistic))
print(f"Maximum difference: {max_diff:.4f}")

Paragraph 5: Justifying the Logistic vs. the Normal CDF

Summary: To compare the logistic function and the normal CDF, the author collected multiple datasets of dose-mortality observations that met two criteria:

  1. They had a sufficient number of observations.
  2. They had already been analyzed using the normal CDF.

The author then applied the logistic function to these datasets and compared the results. For the normal CDF analysis, the author used maximum likelihood estimates provided by the original studies when available, or calculated them using the same method when not provided.

Intuition: This approach allows for a fair comparison between the two methods, as it uses real-world data that has already been analyzed with the normal CDF. By applying the logistic function to the same datasets, direct comparisons can be made in terms of fit and ease of use.

Python code to demonstrate the concept of fitting both functions to data:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.optimize import curve_fit

# Generate synthetic dose-response data
np.random.seed(0)
doses = np.logspace(-1, 2, 20)
responses = norm.cdf(np.log10(doses), loc=1, scale=0.5) + np.random.normal(0, 0.05, 20)

# Define functions to fit
def normal_cdf(x, mu, sigma):
    return norm.cdf(np.log10(x), loc=mu, scale=sigma)

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (np.log10(x) - x0)))

# Fit both functions
popt_norm, _ = curve_fit(normal_cdf, doses, responses)
popt_logistic, _ = curve_fit(logistic, doses, responses, p0=[1, 1, 1])

# Plot results
plt.semilogx(doses, responses, 'ko', label='Data')
x_plot = np.logspace(-1, 2, 100)
plt.semilogx(x_plot, normal_cdf(x_plot, *popt_norm), 'r-', label='Normal CDF fit')
plt.semilogx(x_plot, logistic(x_plot, *popt_logistic), 'b--', label='Logistic fit')
plt.xlabel('Dose')
plt.ylabel('Response')
plt.legend()
plt.grid(True)
plt.show()

Paragraph 6: Fitting The Logistic

Summary: The author considers different methods for fitting the logistic function to the data, with a focus on whether to use the maximum likelihood method. This method has been used in various specific contexts and has been advocated for general use by prominent statisticians like R. A. Fisher and E. B. Wilson.

Intuition: The choice of fitting method is crucial as it can affect the results and interpretation of the analysis. Maximum likelihood estimation is a popular and powerful technique that finds the parameter values that make the observed data most probable under the assumed model.

Python code demonstrating maximum likelihood estimation for the logistic function:

import numpy as np
from scipy.optimize import minimize

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (x - x0)))

def negative_log_likelihood(params, x, y):
    L, k, x0 = params
    p = logistic(x, L, k, x0)
    # Avoid log(0) by clipping the probabilities
    p = np.clip(p, 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Generate synthetic data
np.random.seed(0)
x = np.linspace(-5, 5, 100)
true_params = [1, 1, 0]
y = logistic(x, *true_params) + np.random.normal(0, 0.1, 100)

# Perform maximum likelihood estimation with bounds
initial_guess = [1, 1, 0]
bounds = [(0.1, 10), (0.1, 10), (-5, 5)]
result = minimize(negative_log_likelihood, initial_guess, args=(x, y), bounds=bounds)

print("True parameters:", true_params)
print("Estimated parameters:", result.x)
True parameters: [1, 1, 0]
Estimated parameters: [ 1.03176494e+00  9.90544487e-01 -9.23055117e-04]

Paragraph 7: MLE vs. LS

Summary: The author expresses skepticism about the superiority of the maximum likelihood method over other fitting methods, particularly least squares. He then explains the logistic function and its parameters:

  1. The logistic function has two parameters, $a$ and $\beta$.
  2. The LD50 (lethal dose for 50% of the population) is given by $a/\beta$.

The author then explains the principle of maximum likelihood:

  1. Choose parameter estimates that maximize the probability of observing the given data.
  2. If parameter values $a_1, b_1$ give a higher probability of observing the data than $a_2, b_2$, then $a_1, b_1$ have the greater likelihood.
  3. The maximum likelihood estimates are found by maximizing the joint probability (or its logarithm) of all observations, typically by differentiating with respect to the parameters and solving for the maximum.

Intuition: Maximum likelihood tries to find the parameter values that make the observed data most probable. It's like adjusting the parameters of a coin-flipping model to make the observed sequence of heads and tails most likely.
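
To make the principle concrete, here is a minimal sketch (not from the paper) that compares the likelihood of two candidate parameter pairs on made-up dose-mortality data, assuming the parameterization $Q(x) = 1/(1 + e^{a - bx})$, for which the LD50 is $a/b$ as stated above:

import numpy as np
from scipy.stats import binom

# Made-up dose-mortality data (doses, group sizes, observed deaths)
x = np.array([1, 2, 3, 4, 5])
n = np.array([20, 20, 20, 20, 20])
m = np.array([3, 7, 11, 15, 18])

def log_likelihood(a, b):
    # Assumed parameterization: Q(x) = 1 / (1 + exp(a - b*x)), so LD50 = a/b
    Q = 1.0 / (1.0 + np.exp(a - b * x))
    # Binomial log-probability of observing m deaths out of n at each dose
    return np.sum(binom.logpmf(m, n, Q))

# The pair that gives the observed data the higher probability has the
# greater likelihood -- that is the whole principle of maximum likelihood.
for a, b in [(3.0, 1.0), (5.0, 1.0)]:
    print(f"a={a}, b={b}: log-likelihood={log_likelihood(a, b):.2f}, LD50={a/b:.2f}")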

Paragraph 8: Bias of MLE

Summary: The author discusses the advantages and potential drawbacks of the maximum likelihood method:

Advantages:

  1. It's intuitively appealing.
  2. It uses principles from inverse probability.
  3. It has broad applicability across different problems.

Drawbacks:

  1. It can conflict with other well-established statistical principles.
  2. It can produce biased parameter estimates in some cases.
  3. When maximum likelihood and least squares give different results, it's unclear which method should be used for assessing goodness-of-fit using the chi-square distribution.
  4. The author suspects that maximum likelihood estimates may have larger standard errors than least squares estimates.

Intuition: While maximum likelihood is a powerful and widely-used method, it's not always the best choice. Different estimation methods can have trade-offs between bias, variance, and computational complexity.

Python code to compare maximum likelihood and linear least squares estimation:

import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

def logistic(x, k, x0):
    return 1 / (1 + np.exp(-k * (x - x0)))

def logit(p):
    return np.log(p / (1 - p))

def negative_log_likelihood(params, x, y):
    k, x0 = params
    p = logistic(x, k, x0)
    p = np.clip(p, 1e-10, 1 - 1e-10)  # Clip probabilities to avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Monte Carlo simulation parameters
num_simulations = 1000
true_params = np.array([1, 0])
x = np.linspace(-5, 5, 100)
SNR_dB = np.arange(0, 25, 5)
SNR_linear = 10 ** (SNR_dB / 10)

mse_mle = np.zeros((len(SNR_dB), 2))
mse_ls = np.zeros((len(SNR_dB), 2))
bias_mle = np.zeros((len(SNR_dB), 2))
bias_ls = np.zeros((len(SNR_dB), 2))

# Perform Monte Carlo simulation
for snr_idx, snr in enumerate(SNR_linear):
    for _ in range(num_simulations):
        # Generate synthetic data
        np.random.seed()
        signal_power = np.var(logistic(x, *true_params))
        noise_power = signal_power / snr
        y = logistic(x, *true_params) + np.random.normal(0, np.sqrt(noise_power), 100)
        y = np.clip(y, 1e-10, 1 - 1e-10)  # Avoid log(0) issues
        
        # Maximum Likelihood Estimation with random initial guess
        initial_guess = [np.random.uniform(0.1, 2), np.random.uniform(-1, 1)]
        bounds = [(0.1, 10), (-5, 5)]
        ml_result = minimize(negative_log_likelihood, initial_guess, args=(x, y), bounds=bounds)
        
        # Linear Least Squares Estimation
        y_logit = logit(y)
        A = np.vstack([x, np.ones_like(x)]).T
        params_ls, _, _, _ = np.linalg.lstsq(A, y_logit, rcond=None)
        k_ls = params_ls[0]
        x0_ls = -params_ls[1] / k_ls
        
        # Compute bias and MSE for MLE
        bias_mle[snr_idx] += (ml_result.x - true_params)
        mse_mle[snr_idx] += (ml_result.x - true_params) ** 2
        
        # Compute bias and MSE for Linear LSE
        linear_ls_params = np.array([k_ls, x0_ls])
        bias_ls[snr_idx] += (linear_ls_params - true_params)
        mse_ls[snr_idx] += (linear_ls_params - true_params) ** 2
    
    # Average bias and MSE over all simulations
    bias_mle[snr_idx] /= num_simulations
    mse_mle[snr_idx] /= num_simulations
    bias_ls[snr_idx] /= num_simulations
    mse_ls[snr_idx] /= num_simulations

# Plot results
plt.figure(figsize=(14, 6))

plt.subplot(1, 2, 1)
plt.plot(SNR_dB, mse_mle[:, 0], 'r-o', label='MLE k')
plt.plot(SNR_dB, mse_ls[:, 0], 'b-o', label='LS k')
plt.plot(SNR_dB, mse_mle[:, 1], 'r--o', label='MLE x0')
plt.plot(SNR_dB, mse_ls[:, 1], 'b--o', label='LS x0')
plt.xlabel('SNR (dB)')
plt.ylabel('MSE')
plt.title('Mean Squared Error vs SNR')
plt.legend()
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(SNR_dB, bias_mle[:, 0], 'r-o', label='MLE k')
plt.plot(SNR_dB, bias_ls[:, 0], 'b-o', label='LS k')
plt.plot(SNR_dB, bias_mle[:, 1], 'r--o', label='MLE x0')
plt.plot(SNR_dB, bias_ls[:, 1], 'b--o', label='LS x0')
plt.xlabel('SNR (dB)')
plt.ylabel('Bias')
plt.title('Bias vs SNR')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()
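
For reference, here is a direct comparison of the two estimators on a single synthetic dataset (maximum likelihood via minimize, least squares via curve_fit):
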
import numpy as np
from scipy.optimize import minimize, curve_fit

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (x - x0)))

def negative_log_likelihood(params, x, y):
    L, k, x0 = params
    p = logistic(x, L, k, x0)
    p = np.clip(p, 1e-10, 1 - 1e-10)  # Clip probabilities to avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Generate synthetic data
np.random.seed(0)
x = np.linspace(-5, 5, 100)
true_params = [1, 1, 0]
y = logistic(x, *true_params) + np.random.normal(0, 0.1, 100)

# Maximum Likelihood Estimation
ml_result = minimize(negative_log_likelihood, [1, 1, 0], args=(x, y))

# Least Squares Estimation
ls_result, _ = curve_fit(logistic, x, y, p0=[1, 1, 0])

print("True parameters:", true_params)
print("ML estimates:", ml_result.x)
print("LS estimates:", ls_result)
Note that the “logit linearization” fails miserably when there is some noise.

Now let’s assume a high SNR (the simulation below uses noiseless data) and see how the results change as we go from 10 measurements to 500.

import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

def logistic(x, k, x0):
    return 1 / (1 + np.exp(-k * (x - x0)))

def logit(p):
    return np.log(p / (1 - p))

def negative_log_likelihood(params, x, y):
    k, x0 = params
    p = logistic(x, k, x0)
    p = np.clip(p, 1e-10, 1 - 1e-10)  # Clip probabilities to avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Monte Carlo simulation parameters
num_simulations = 1000
true_params = np.array([1, 0])
sample_sizes = np.arange(10, 510, 10)

mse_mle = np.zeros((len(sample_sizes), 2))
mse_ls = np.zeros((len(sample_sizes), 2))
bias_mle = np.zeros((len(sample_sizes), 2))
bias_ls = np.zeros((len(sample_sizes), 2))

# Perform Monte Carlo simulation
for size_idx, n_samples in enumerate(sample_sizes):
    for _ in range(num_simulations):
        # Generate synthetic data without noise
        np.random.seed()
        x = np.linspace(-5, 5, n_samples)
        y = logistic(x, *true_params)
        y = np.clip(y, 1e-10, 1 - 1e-10)  # Avoid log(0) issues
        
        # Maximum Likelihood Estimation with random initial guess
        initial_guess = [np.random.uniform(0.1, 2), np.random.uniform(-1, 1)]
        bounds = [(0.1, 10), (-5, 5)]
        ml_result = minimize(negative_log_likelihood, initial_guess, args=(x, y), bounds=bounds)
        
        # Linear Least Squares Estimation
        y_logit = logit(y)
        A = np.vstack([x, np.ones_like(x)]).T
        params_ls, _, _, _ = np.linalg.lstsq(A, y_logit, rcond=None)
        k_ls = params_ls[0]
        x0_ls = -params_ls[1] / k_ls
        
        # Compute bias and MSE for MLE
        bias_mle[size_idx] += (ml_result.x - true_params)
        mse_mle[size_idx] += (ml_result.x - true_params) ** 2
        
        # Compute bias and MSE for Linear LSE
        linear_ls_params = np.array([k_ls, x0_ls])
        bias_ls[size_idx] += (linear_ls_params - true_params)
        mse_ls[size_idx] += (linear_ls_params - true_params) ** 2
    
    # Average bias and MSE over all simulations
    bias_mle[size_idx] /= num_simulations
    mse_mle[size_idx] /= num_simulations
    bias_ls[size_idx] /= num_simulations
    mse_ls[size_idx] /= num_simulations

# Plot results
plt.figure(figsize=(14, 6))

plt.subplot(1, 2, 1)
plt.plot(sample_sizes, mse_mle[:, 0], 'r-o', label='MLE k')
plt.plot(sample_sizes, mse_ls[:, 0], 'b-o', label='LS k')
plt.plot(sample_sizes, mse_mle[:, 1], 'r--o', label='MLE x0')
plt.plot(sample_sizes, mse_ls[:, 1], 'b--o', label='LS x0')
plt.xlabel('Number of Samples')
plt.ylabel('MSE')
plt.title('Mean Squared Error vs Number of Samples')
plt.legend()
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(sample_sizes, bias_mle[:, 0], 'r-o', label='MLE k')
plt.plot(sample_sizes, bias_ls[:, 0], 'b-o', label='LS k')
plt.plot(sample_sizes, bias_mle[:, 1], 'r--o', label='MLE x0')
plt.plot(sample_sizes, bias_ls[:, 1], 'b--o', label='LS x0')
plt.xlabel('Number of Samples')
plt.ylabel('Bias')
plt.title('Bias vs Number of Samples')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()
It seems that the bias of the LS method is smaller here.

Paragraph 9: Likelihood function

Summary: This paragraph introduces the likelihood function for the logistic model in a dose-response context:

  1. $Q_i$ is the mortality rate at dose $x_i$, and $P_i = 1 - Q_i$ is the survival rate.
  2. The probability of observing $m_i$ deaths out of $n_i$ exposed subjects at dose $x_i$ follows the binomial distribution $C_i\, Q_i^{m_i} P_i^{\,n_i - m_i}$, where $C_i = \binom{n_i}{m_i}$ is the binomial coefficient.
  3. The likelihood of all observations is the product of these probabilities.
  4. The log-likelihood is therefore $\ln \mathcal{L} = \sum_i \left[\ln C_i + m_i \ln Q_i + (n_i - m_i) \ln P_i\right]$.

Intuition: The likelihood function quantifies how probable the observed data is under the assumed model. By maximizing this function, we find the parameter values that make the observed data most likely.

Python code to calculate the log-likelihood for a logistic dose-response model:

import numpy as np
from scipy.special import comb

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (x - x0)))

def log_likelihood(params, x, m, n):
    L, k, x0 = params
    Q = logistic(x, L, k, x0)
    P = 1 - Q

    log_C = np.sum([np.log(comb(n_i, m_i)) for n_i, m_i in zip(n, m)])
    log_Q_term = np.sum(m * np.log(Q))
    log_P_term = np.sum((n - m) * np.log(P))

    return log_C + log_Q_term + log_P_term

# Example usage
x = np.array([1, 2, 3, 4, 5])
m = np.array([2, 5, 8, 12, 15])
n = np.array([20, 20, 20, 20, 20])
params = [1, 1, 3]  # L, k, x0

ll = log_likelihood(params, x, m, n)
print(f"Log-likelihood: {ll}")
Log-likelihood: -10.438202688532435

Paragraph 10: Derivation of the MLE

Summary: This paragraph derives the conditions (through derivatives) for maximum likelihood estimation of the logistic function parameters:

  1. The partial derivatives of the log-likelihood with respect to the parameters $a$ and $b$ (for the parameterization $Q_i = 1/(1+e^{-(a+bx_i)})$ used in the code below) are $\partial \ln \mathcal{L}/\partial a = \sum_i n_i (q_i - Q_i)$ and $\partial \ln \mathcal{L}/\partial b = \sum_i n_i x_i (q_i - Q_i)$.
  2. Setting these derivatives to zero yields the maximum likelihood conditions $\sum_i n_i q_i = \sum_i n_i Q_i$ and $\sum_i n_i x_i q_i = \sum_i n_i x_i Q_i$.

Intuition: These conditions ensure that the fitted model matches the observed data in terms of the total number of deaths and the "weighted" total (by dose) number of deaths.

Python code to implement these conditions for maximum likelihood estimation:

import numpy as np
from scipy.optimize import fsolve

def logistic(x, a, b):
    return 1 / (1 + np.exp(-(a + b*x)))

def ml_equations(params, x, n, q):
    a, b = params
    Q = logistic(x, a, b)
    eq1 = np.sum(n * Q) - np.sum(n * q)
    eq2 = np.sum(n * x * Q) - np.sum(n * x * q)
    return [eq1, eq2]

# Generate real logistic data
np.random.seed(42)
x = np.array([1, 2, 3, 4, 5])
true_a, true_b = -2, 1  # True parameter values
true_prob = logistic(x, true_a, true_b)
n = np.array([50, 50, 50, 50, 50])  # 50 trials for each x value

# Generate successes for each probability
successes = np.array([np.random.binomial(n_i, p) for n_i, p in zip(n, true_prob)])
q = successes / n

print("Generated data:")
print(f"x: {x}")
print(f"n: {n}")
print(f"q: {q}")

# Solve for maximum likelihood estimates
initial_guess = [0, 1]
a_ml, b_ml = fsolve(ml_equations, initial_guess, args=(x, n, q))

print(f"\nTrue parameters: a = {true_a}, b = {true_b}")
print(f"ML estimates: a = {a_ml:.4f}, b = {b_ml:.4f}")

x: [1 2 3 4 5]
n: [50 50 50 50 50]
q: [0.24 0.62 0.7  0.88 0.98]
True parameters: a = -2, b = 1
ML estimates: a = -1.9916, b = 1.0429

Paragraph 11: Weighted Least Squares

Summary: This paragraph introduces the least squares method for fitting the logistic function and explains some challenges:

  1. The least squares principle aims to minimize $\sum W_i(q_i - Q_i)^2$, where $W_i$ is a weight inversely proportional to the variance of $q_i$.
  2. The variance of $q_i$ is given by $P_iQ_i/n_i$.
  3. A direct least squares solution is not possible because: a. the logistic function is not linear in its parameters, and b. the weights contain the unknown quantities $PQ$.
  4. A solution can be obtained through Taylor series expansion and successive approximation, but this is computationally intensive (a rough sketch of this iterative idea follows below).
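
A rough sketch of that successive-approximation idea on made-up data: the weights $W_i = n_i/(P_iQ_i)$ are recomputed from the current fit and the weighted sum of squares is re-minimized numerically (scipy.optimize.minimize stands in here for the Taylor-expansion scheme the paper alludes to).

import numpy as np
from scipy.optimize import minimize

# Made-up dose-mortality data: doses x, group sizes n, observed mortality q
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = np.array([50, 50, 50, 50, 50])
q = np.array([0.10, 0.30, 0.55, 0.80, 0.94])

def Q(x, a, b):
    # Two-parameter logistic for the expected mortality
    return 1.0 / (1.0 + np.exp(-(a + b * x)))

# Alternate between updating the weights W_i = n_i / (P_i Q_i) from the
# current fit and re-minimizing the weighted sum of squares.
params = np.array([0.0, 1.0])  # starting values for (a, b)
for _ in range(10):
    Qi = np.clip(Q(x, *params), 1e-6, 1 - 1e-6)
    W = n / (Qi * (1 - Qi))
    objective = lambda p: np.sum(W * (q - Q(x, *p)) ** 2)
    params = minimize(objective, params, method="Nelder-Mead").x

print("Iterated weighted least squares estimates (a, b):", params)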

Paragraph 12: Logit Transformation (linearization)

Summary: This paragraph introduces a transformation that simplifies the least squares problem; it is the logit linearization scheme used in the code above.

  1. Instead of working with $q_i$, we use the logit transformation $l_i = \ln(p_i/q_i)$.
  2. For small differences, $(q_i - Q_i)^2$ can be approximated by a. $(P_iQ_i)^2(l_i - L_i)^2$ or, better, b. $(P_iQ_i)(l_i - L_i)^2$.
  3. Using this approximation, the weighted sum of squares becomes:

$$\chi^2 = \sum \frac{n_i}{P_iQ_i}(q_i - Q_i)^2 \approx \sum n_i\, p_i q_i\, (l_i - L_i)^2$$

Intuition: The logit transformation linearizes the problem and allows for a simpler formulation of the least squares objective. This makes the optimization easier to solve. Now it can be solved with linear least squares in one iteration.

Python code demonstrating the logit transformation and the approximation:

import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def logistic(x):
    return 1 / (1 + np.exp(-x))

# Generate some example data
np.random.seed(0)
X = np.linspace(-5, 5, 100)
q = logistic(X) + np.random.normal(0, 0.05, 100)
q = np.clip(q, 0.001, 0.999)  # Avoid log(0) issues

# Calculate logits
l = logit(q)
L = X  # True logits

# Compare different approximations
diff_sq = (q - logistic(X))**2
approx1 = (q * (1-q))**2 * (l - L)**2
approx2 = (q * (1-q)) * (l - L)**2

print("Mean squared difference:", np.mean(diff_sq))
print("Mean of approximation 1:", np.mean(approx1))
print("Mean of approximation 2:", np.mean(approx2))
Mean squared difference: 0.002040809945120254
Mean of approximation 1: 0.004312336471942783
Mean of approximation 2: 0.04490201600367433

Paragraph 13: The Transformed Weights for WLS

image

Summary: This paragraph highlights the advantages of using the logit transformation for least squares estimation of the logistic function, as well as for the weights used in weighted least squares, as shown in (5) below:

  1. The quantity to be minimized is $\chi^2 = \sum \frac{n_i}{P_iQ_i}(q_i - Q_i)^2$.
  2. Using the approximation (11.1), this becomes equivalent to minimizing $\sum n_i\, p_i q_i\, (l_i - L_i)^2$.
  3. The weights in this formulation ($n_i p_i q_i$) depend only on observed quantities, not on fitted values.
  4. This simplification is unique to the logistic function and does not occur for other functions such as the exponential or the normal CDF.
  5. As a result, a least squares solution can be obtained directly, without iterative approximation.
  6. The problem reduces to a weighted linear regression of $L$ on $x$, with weights $W_i = n_i p_i q_i$.

Intuition: The logit transformation turns a complex nonlinear problem into a simpler linear one, making it much easier to solve and avoiding the need for iterative procedures.

Paragraph 14: Weights Continued

Summary: This paragraph outlines the practical steps for fitting the logistic function using the logit transformation and weighted least squares:

  1. For each observation $q_i$ at dose $x_i$, calculate the logit $L_i = \ln\left(\frac{1-q_i}{q_i}\right)$.
  2. The author provides a chart (Chart II) for convenient logit calculation.
  3. Perform a weighted least squares regression of $L_i$ on $x_i$, using weights $W_i = n_i p_i q_i$.
  4. This approach closely approximates a least squares solution for the original logistic function.

Intuition: This method transforms the problem into a weighted linear regression, which is much simpler to solve than the original nonlinear problem. The weights account for the varying precision of different observations.

Python code demonstrating this weighted logit regression approach:
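
A minimal sketch on made-up dose-mortality data (the arrays below and the use of np.polyfit for the weighted fit are illustrative choices, not taken from the paper). With the logit $l_i = \ln(p_i/q_i)$, the fitted line is $L = a - bx$, so the LD50 falls out as $a/b$:

import numpy as np

# Made-up dose-mortality data: dose x_i, group size n_i, observed mortality q_i
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = np.array([50, 50, 50, 50, 50])
q = np.array([0.10, 0.30, 0.55, 0.80, 0.94])
p = 1 - q

# Observed logits l_i = ln(p_i / q_i) and weights W_i = n_i p_i q_i
l = np.log(p / q)
W = n * p * q

# Weighted linear regression of l on x.  np.polyfit's `w` argument multiplies
# the residuals, so pass sqrt(W) to minimize  sum W_i (l_i - (a - b x_i))^2.
slope, intercept = np.polyfit(x, l, 1, w=np.sqrt(W))
a_hat, b_hat = intercept, -slope  # l decreases with dose, so the slope is -b

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
print(f"LD50 = a/b = {a_hat / b_hat:.3f}")  # dose at which l = 0, i.e. q = 0.5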

Paragraph 15: Handling Corner Cases

Summary: This paragraph addresses a practical issue in logit regression: handling observations with 0% or 100% mortality rates:

  1. Logits are undefined for probabilities of 0 or 1 (as log(0) is undefined).
  2. To handle this, the author suggests: a. Obtain a preliminary solution by omitting these extreme observations. b. Use this preliminary solution to estimate the expected value for these doses. c. Replace the extreme observation with a value halfway between the estimate and the actual observation (0 or 1).

Intuition: This approach allows us to include information from these extreme observations without letting them unduly influence the fit or cause computational issues. In the code above, we sidestepped this problem by clipping:

y = np.clip(y, 1e-10, 1 - 1e-10)  # Avoid log(0) issues
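
For completeness, here is a minimal sketch of the halfway-value adjustment the author suggests, using made-up observed mortalities and made-up preliminary-fit values:

import numpy as np

# Made-up observed mortalities, including the extreme values 0 and 1
q_obs = np.array([0.00, 0.10, 0.30, 0.55, 0.80, 1.00])
# Expected mortalities at the same doses from a preliminary fit obtained by
# omitting the extreme observations (values made up for illustration)
q_fit = np.array([0.03, 0.12, 0.32, 0.58, 0.82, 0.96])

# Replace each extreme observation with a value halfway between the
# preliminary estimate and the observed 0 or 1, then compute the logits.
q_work = q_obs.copy()
extreme = (q_obs == 0) | (q_obs == 1)
q_work[extreme] = (q_obs[extreme] + q_fit[extreme]) / 2

logits = np.log((1 - q_work) / q_work)
print("Working values:", q_work)
print("Logits:", logits)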

Paragraph 16: χ² Test

Summary: This paragraph presents the results of comparing the logistic function to the normal curve (probit model) for various datasets:

  1. The comparison is based on chi-square ($\chi^2$) values, which measure the goodness of fit.
  2. The results show that: a. In many cases, the logistic and normal curves perform similarly. b. When there is a difference, the logistic function tends to perform better. c. The normal curve never outperformed the logistic function in these examples.
  3. The most significant advantage for the logistic function was observed in the Murray male fly series, which had the largest number of observations.

Intuition: While the logistic and normal curves are similar in shape, the logistic function seems to be more flexible in fitting real-world dose-response data, especially when more data points are available.

Logistic chi-square: 89.60439530837866
Probit chi-square: 481.77319536890474
Logistic MSE: 9.780056532060929e-05
Probit MSE: 0.000141788446123677
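
As a generic illustration of the $\chi^2$ comparison (the observed and fitted values below are made up and unrelated to the numbers reported above):

import numpy as np

def chi_square(n, q_obs, Q_fit):
    # chi^2 = sum n_i (q_i - Q_i)^2 / (P_i Q_i), with P_i = 1 - Q_i
    P_fit = 1 - Q_fit
    return np.sum(n * (q_obs - Q_fit) ** 2 / (P_fit * Q_fit))

# Made-up observed mortalities and two hypothetical fitted curves
n = np.array([50, 50, 50, 50, 50])
q_obs = np.array([0.10, 0.30, 0.55, 0.80, 0.94])
Q_logistic = np.array([0.12, 0.28, 0.52, 0.78, 0.92])  # hypothetical logistic fit
Q_probit = np.array([0.15, 0.27, 0.50, 0.75, 0.90])    # hypothetical probit fit

print("Logistic chi-square:", chi_square(n, q_obs, Q_logistic))
print("Probit chi-square:  ", chi_square(n, q_obs, Q_probit))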

Paragraph 17: Finale of the Paper

Summary: This final paragraph offers some concluding thoughts:

  1. The author suggests that with larger datasets, the logistic function might consistently outperform the normal curve.
  2. However, the difference in performance is not expected to be large, as both curves are similar in shape.
  3. The author believes that fitting the logistic function using the method described in the paper is simpler than fitting the normal curve using probits and maximum likelihood.

Intuition: The logistic and normal curves are very similar, especially in their central regions. The preference for the logistic function seems to be driven more by computational convenience than by substantial differences in fit quality.

Python code to visualize the similarity between logistic and normal curves:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def logistic(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 1000)
y_logistic = logistic(x)
y_normal = norm.cdf(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y_logistic, label='Logistic')
plt.plot(x, y_normal, label='Normal CDF', linestyle='--')
plt.title('Comparison of Logistic and Normal Cumulative Distribution Functions')
plt.xlabel('x')
plt.ylabel('Probability')
plt.legend()
plt.grid(True)
plt.show()

# Calculate maximum difference
max_diff = np.max(np.abs(y_logistic - y_normal))
print(f"Maximum difference between curves: {max_diff:.4f}")
Maximum difference between curves: 0.1174

This code shows the logistic and normal cumulative distribution functions, highlighting their similarity and calculating the maximum difference between them.