symlik: Symbolic Likelihood Models in Python
Define statistical models symbolically and automatically derive score functions, Hessians, and Fisher information.
I've released symlik, a Python library for symbolic likelihood models. The core idea: write your log-likelihood as a symbolic expression, and let the computer derive everything needed for inference.
The Problem
Traditional statistical computing requires either:
- Manual derivation - Work out score functions and information matrices by hand, then implement them
- Numerical approximation - Use finite differences, which can be unstable and slow
Both approaches have drawbacks. Manual derivation is tedious and error-prone. Finite differences are sensitive to the step size (too large gives truncation error, too small gives round-off error) and never give you the symbolic form.
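To make the numerical drawback concrete, here is a minimal standard-library sketch that compares a central finite difference with the hand-derived score of the exponential log-likelihood l(lambda) = sum[log(lambda) - lambda*x_i]. The function names and the small sample are illustrative and are not part of symlik.

```python
import math

# Small illustrative sample (not real measurements).
x = [1.2, 0.8, 2.1, 1.5]

def log_lik(lam):
    """Exponential log-likelihood: sum(log(lam) - lam * x_i)."""
    return sum(math.log(lam) - lam * xi for xi in x)

def score_exact(lam):
    """Hand-derived score: n / lam - sum(x)."""
    return len(x) / lam - sum(x)

def score_fd(lam, h):
    """Central finite-difference approximation of the score."""
    return (log_lik(lam + h) - log_lik(lam - h)) / (2 * h)

lam = 1.0
steps = (1e-1, 1e-5, 1e-13)
errors = {h: abs(score_fd(lam, h) - score_exact(lam)) for h in steps}

for h, err in errors.items():
    print(f"h={h:g}  abs error={err:.3e}")
```

The moderate step should give the smallest error: the large step suffers from truncation and the tiny one from floating-point round-off. Picking a good step is exactly the tuning problem an exact derivative avoids.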
The Solution
symlik takes a third approach: symbolic differentiation. Define your model once, get exact derivatives automatically.
from symlik.distributions import exponential
model = exponential()
data = {'x': [1.2, 0.8, 2.1, 1.5]}
mle, _ = model.mle(data=data, init={'lambda': 1.0})
se = model.se(mle, data)
print(f"Rate: {mle['lambda']:.3f} +/- {se['lambda']:.3f}")
# Rate: 0.714 +/- 0.357
Behind the scenes, symlik:
- Symbolically differentiates the log-likelihood to get the score function
- Differentiates again for the Hessian
- Computes Fisher information from the Hessian
- Derives standard errors from the inverse information matrix
The derivatives are exact symbolic expressions, with no finite-difference approximation.
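For the exponential example, those four steps can be checked by hand. The derivation below is standard textbook math written out for illustration; it is not output from symlik. It uses n = 4 and a data sum of 5.6 from the sample above.

```latex
\begin{aligned}
\ell(\lambda) &= \sum_{i=1}^{n} \left( \log\lambda - \lambda x_i \right)
             = n\log\lambda - \lambda \sum_{i=1}^{n} x_i \\
s(\lambda) &= \frac{\partial \ell}{\partial \lambda}
            = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i \\
H(\lambda) &= \frac{\partial^2 \ell}{\partial \lambda^2}
            = -\frac{n}{\lambda^2} \\
I(\lambda) &= -H(\lambda) = \frac{n}{\lambda^2} \\
\hat\lambda &= \frac{n}{\sum_{i} x_i} = \frac{4}{5.6} \approx 0.714,
\qquad
\operatorname{se}(\hat\lambda) = \sqrt{I(\hat\lambda)^{-1}}
  = \frac{\hat\lambda}{\sqrt{n}} \approx 0.357
\end{aligned}
```

Both numbers agree with the printed result, Rate: 0.714 +/- 0.357.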
Custom Models
The real power is defining custom models using s-expressions:
from symlik import LikelihoodModel
# Exponential: l(lambda) = sum[log(lambda) - lambda*x_i]
log_lik = ['sum', 'i', ['len', 'x'],
           ['+', ['log', 'lambda'],
            ['*', -1, ['*', 'lambda', ['@', 'x', 'i']]]]]
model = LikelihoodModel(log_lik, params=['lambda'])
# Symbolic derivatives available
score = model.score() # Gradient
hess = model.hessian() # Hessian matrix
info = model.information() # Fisher information
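To show what differentiating an s-expression involves, here is a toy evaluator and differentiator for nested lists like the one above. This is a sketch of the general technique, not symlik's implementation: `evaluate` and `diff` are hypothetical helpers, and the sketch assumes that the `'sum'` index runs from 1 to the upper bound and that `'@'` is 1-based indexing. It also performs no algebraic simplification, which a real library would.

```python
import math

def evaluate(expr, env):
    """Evaluate a nested-list expression against a dict of variable values."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    if op == '+':
        return sum(evaluate(a, env) for a in args)
    if op == '*':
        return math.prod(evaluate(a, env) for a in args)
    if op == '/':
        return evaluate(args[0], env) / evaluate(args[1], env)
    if op == 'log':
        return math.log(evaluate(args[0], env))
    if op == 'len':
        return len(evaluate(args[0], env))
    if op == '@':  # assumption: 1-based indexing into a data vector
        return evaluate(args[0], env)[evaluate(args[1], env) - 1]
    if op == 'sum':  # assumption: ['sum', index, upper, body], index = 1..upper
        idx, upper, body = args
        n = evaluate(upper, env)
        return sum(evaluate(body, {**env, idx: i}) for i in range(1, n + 1))
    raise ValueError(f"unknown operator: {op}")

def diff(expr, var):
    """Return the (unsimplified) derivative of expr with respect to var."""
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, *args = expr
    if op == '+':
        return ['+', *(diff(a, var) for a in args)]
    if op == '*':  # product rule for binary products
        a, b = args
        return ['+', ['*', diff(a, var), b], ['*', a, diff(b, var)]]
    if op == '/':  # quotient rule
        a, b = args
        numerator = ['+', ['*', diff(a, var), b],
                     ['*', -1, ['*', a, diff(b, var)]]]
        return ['/', numerator, ['*', b, b]]
    if op == 'log':
        return ['/', diff(args[0], var), args[0]]
    if op == '@':  # data lookups do not depend on the parameters
        return 0
    if op == 'sum':  # differentiate term by term
        idx, upper, body = args
        return ['sum', idx, upper, diff(body, var)]
    raise ValueError(f"unknown operator: {op}")

log_lik = ['sum', 'i', ['len', 'x'],
           ['+', ['log', 'lambda'],
            ['*', -1, ['*', 'lambda', ['@', 'x', 'i']]]]]
data = {'x': [1.2, 0.8, 2.1, 1.5]}

score = diff(log_lik, 'lambda')   # first derivative, still a nested list
hess = diff(score, 'lambda')      # second derivative

score_at_1 = evaluate(score, {**data, 'lambda': 1.0})
hess_at_2 = evaluate(hess, {**data, 'lambda': 2.0})
print(score_at_1, hess_at_2)
```

Because the derivative is itself an expression, it can be differentiated again, inspected, or evaluated at any parameter value without choosing a step size.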
Heterogeneous Data
One of symlik's strengths is handling mixed observation types - exactly what you need for reliability analysis with censored data:
from symlik import ContributionModel
from symlik.contributions import complete_exponential, right_censored_exponential
model = ContributionModel(
    params=["lambda"],
    type_column="status",
    contributions={
        "observed": complete_exponential(),
        "censored": right_censored_exponential(),
    }
)
Each observation type contributes differently to the likelihood. symlik handles the bookkeeping.
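As a plain-Python illustration of that bookkeeping, the sketch below uses the standard textbook contributions for exponential lifetimes: an observed failure at time t contributes the log density, log(lambda) - lambda*t, and a right-censored time contributes the log survival, -lambda*t. It assumes these are the forms the two contribution helpers encode; the function names and the status labels attached to the sample are made up for the example.

```python
import math

def log_lik(lam, times, status):
    """Sum per-observation contributions, chosen by observation type."""
    total = 0.0
    for t, s in zip(times, status):
        if s == "observed":
            total += math.log(lam) - lam * t   # log density of a failure at t
        elif s == "censored":
            total += -lam * t                  # log survival past t
        else:
            raise ValueError(f"unknown observation type: {s}")
    return total

def mle(times, status):
    """Closed-form MLE: number of observed failures / total time at risk."""
    failures = sum(s == "observed" for s in status)
    return failures / sum(times)

# Illustrative data: the second unit is treated as right-censored.
times = [1.2, 0.8, 2.1, 1.5]
status = ["observed", "censored", "observed", "observed"]

lam_hat = mle(times, status)
print(f"Rate: {lam_hat:.3f}")
```

Censoring lowers the estimate compared with treating every time as a failure, because the censored unit adds time at risk without adding a failure.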
Installation
pip install symlik
Documentation is available at queelius.github.io/symlik.
Originally published at metafunctor.com