Projects Portfolio

Welcome to my curated collection of selected projects. Below, hover over each card for extra details.

Note: This page is currently under development and more projects will be added soon!

Fake comics generation

MLLM, VLM, LLM, Computer Vision, NLP

Custom comic storyline and panel generation matching the Western style of the Tex Willer comics.

Key technologies

  • Multimodal language models
  • Synthetic data generation
  • Fine-tuning
  • Diffusion

Goals

  • Generate custom storylines in the style of Western comics from details supplied by the user
  • Generate panels (images) with the same style, setting, and characters as the original comic, but following the custom storyline

Features

  • End-to-end project, from data collection to deployment on my hardware
  • Use of image-text-to-text technology
  • Use of text-to-image technology

Challenges and considerations

  • Must run on consumer hardware
  • Very limited hand-labeled data (to be created by me)
  • Matching the original characters' behavior in the generated story may be tricky

Work in Progress: check the link below to find out more!

Learn more

BayesCART: a Python package for efficient Bayesian CART model search

Python package development, Object oriented programming, Bayesian statistics, Decision trees

A Python package for Bayesian Classification and Regression Trees (CART) with advanced Markov Chain Monte Carlo (MCMC) sampling. Designed for modularity, efficiency, and extensibility, it follows best practices of modern, high-performance machine learning software.

Features

  • Advanced ML algorithms: tackles Bayesian CART multimodality using custom MCMC samplers
  • Efficient software design: uses modular OOP principles, parallelization, and caching
  • Robust engineering practices: comprehensive documentation, type hints, and automated CI workflows
  • Scalability and extensibility: supports large datasets and subclassing
  • Parallelism and performance: supports parallel execution for scalable inference
  • Optimized memory usage: lightweight copy mechanisms reduce memory overhead

Documentation and Testing

  • Sphinx Documentation
  • Type Annotations
  • Automated Testing & CI workflows

Learn more

Tempered Stochastic Search of Bayesian CART Models

Bayesian CART, Decision trees, Multimodal posterior, Optimization, MCMC

Master's research thesis on Bayesian Classification and Regression Trees (CART) models.

Developed custom MCMC posterior-sampling techniques to address multimodality and local optima.

Outcome: enables the solution of new problem classes

Challenges

  • Sharply peaked multimodal posterior landscape
  • Traditional MCMC samplers get stuck in local optima
  • Standard posterior flattening tends to sample overfitted trees
  • Wrong splits in large trees are nearly impossible to undo

Solutions

  • Use tempering to encourage mode jumping
  • Introduce a pseudo-prior to enforce a maximum tree depth during exploration
  • Bias the search towards small trees, yielding a bottom-up constructive tree search
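
The mode-jumping effect of tempering can be sketched on a toy problem. The snippet below is only an illustration under stated assumptions, not the sampler from the thesis: it uses a hypothetical one-dimensional target with two narrow modes instead of a posterior over trees, and the names `log_target`, `metropolis`, and `right_mode_fraction` are made up for this example. A random-walk Metropolis chain started in the left mode is run once on the original target and once on the flattened target (log-density divided by a temperature).

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of two narrow Gaussian modes at -4 and +4."""
    a = -0.5 * ((x + 4.0) / 0.5) ** 2
    b = -0.5 * ((x - 4.0) / 0.5) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(n_steps, temperature=1.0, step=1.0, x0=-4.0, seed=0):
    """Random-walk Metropolis on the flattened target pi(x) ** (1 / temperature)."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_target(proposal)
        # Dividing the log-ratio by the temperature lowers the barrier between modes.
        log_alpha = (lp_new - lp) / temperature
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x, lp = proposal, lp_new
        samples.append(x)
    return samples


def right_mode_fraction(samples):
    """Share of samples on the right-hand side (x > 0)."""
    return sum(s > 0.0 for s in samples) / len(samples)


if __name__ == "__main__":
    cold = metropolis(50_000, temperature=1.0)
    hot = metropolis(50_000, temperature=30.0)
    print("untempered, fraction in right mode:", right_mode_fraction(cold))
    print("tempered,   fraction in right mode:", right_mode_fraction(hot))
```

The untempered chain stays in the mode where it started, while the tempered chain visits both. Note that the tempered chain samples the flattened target rather than the original one, so on its own it is an exploration device; this toy also says nothing about the overfitting issue that plain flattening causes for trees, which is what the pseudo-prior and the small-tree bias above are meant to address.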

Outcomes

  • Orders of magnitude faster convergence
  • Drastically improved multimodal posterior exploration
  • Accurate posterior probability estimates on smaller problems

Limitations

  • The space of trees grows combinatorially with the data size
  • Extensive exploration is still tricky in complex problems
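
To give a feel for the first limitation, here is a small sketch that counts only the distinct shapes of full binary trees with a given number of leaves (the Catalan numbers). It is a deliberate simplification: it ignores the choice of split variable and threshold at each internal node, and the number of candidate thresholds itself grows with the data, so the real search space is much larger still. The function name `tree_shapes` is hypothetical.

```python
import math


def tree_shapes(n_leaves):
    """Number of distinct full binary tree shapes with n_leaves leaves.

    This is the Catalan number C(n_leaves - 1); split variables and
    thresholds are ignored, so it undercounts the actual CART model space.
    """
    if n_leaves < 1:
        raise ValueError("a tree needs at least one leaf")
    k = n_leaves - 1  # number of internal (split) nodes
    return math.comb(2 * k, k) // (k + 1)


if __name__ == "__main__":
    for n in (2, 4, 8, 16, 32):
        print(n, "leaves:", tree_shapes(n), "shapes")
```

Even this shape-only count grows exponentially in the number of leaves, which is why exhaustive exploration is out of reach beyond small problems.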

Learn more