Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies
We analyze popular Stable Diffusion models using representational similarity and norms. Our findings reveal three phenomena: (1) the presence of a learned positional embedding in intermediate representations, (2) high-similarity corner artifacts, and (3) anomalous high-norm artifacts.
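As a rough illustration of the two measurements named above, the sketch below computes per-position norms and a cosine-similarity map for a single feature map. It assumes a NumPy array of shape `(C, H, W)`; the function names are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def position_norms(feats):
    """L2 norm of the channel vector at each spatial position; returns (H, W)."""
    return np.linalg.norm(feats, axis=0)

def similarity_to_position(feats, row, col, eps=1e-12):
    """Cosine similarity between the vector at (row, col) and every position."""
    norms = position_norms(feats)
    # Normalize each spatial position's channel vector to unit length.
    unit = feats / np.maximum(norms, eps)
    ref = unit[:, row, col]
    # Dot product of the reference unit vector with every position.
    return np.einsum("c,chw->hw", ref, unit)

# Tiny hand-made feature map with C=2, H=2, W=2 (illustrative values only).
feats = np.zeros((2, 2, 2))
feats[:, 0, 0] = [3.0, 4.0]
feats[:, 0, 1] = [6.0, 8.0]
feats[:, 1, 0] = [-3.0, -4.0]
feats[:, 1, 1] = [4.0, -3.0]

norm_map = position_norms(feats)
sim_map = similarity_to_position(feats, 0, 0)
```

In a map like `norm_map`, a high-norm anomaly would appear as a position whose value is far above its neighbors; in `sim_map`, a positional pattern would appear as similarity that depends on location rather than on image content.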
Ring Neural Networks
An experimental neural network architecture where weights and activations are angles on a ring instead of Cartesian coordinates, naturally represented by integers with overflow. Neurons rotate their inputs and aggregate them as unit vectors, replacing dot products. Includes a custom fixed-point autograd and a CUDA-accelerated PyTorch implementation.
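The idea of a single ring neuron can be sketched in a few lines of NumPy. This is a minimal forward-pass illustration under stated assumptions, not the project's implementation: angles are assumed to be stored as `uint8` (256 steps per full turn), rotation is taken to be wrapping integer addition, and the aggregate is taken to be the direction of the summed unit vectors. The name `ring_neuron` is hypothetical.

```python
import numpy as np

STEPS = 256  # assumed ring resolution: one uint8 step is 2*pi/256 radians

def ring_neuron(inputs, weights):
    """Forward pass of one hypothetical ring neuron.

    inputs, weights: uint8 arrays of equal shape, each value an angle on the ring.
    Returns the aggregated angle as an int in [0, STEPS).
    """
    # Rotation: uint8 array addition wraps around on overflow, which is
    # exactly addition of angles modulo a full turn.
    rotated = inputs + weights
    theta = rotated.astype(np.float64) * (2.0 * np.pi / STEPS)
    # Aggregation: sum the unit vectors and take the direction of the result.
    x = np.cos(theta).sum()
    y = np.sin(theta).sum()
    angle = np.arctan2(y, x)  # radians in [-pi, pi]
    return int(np.round(angle / (2.0 * np.pi) * STEPS)) % STEPS

# 200 + 100 overflows uint8 and wraps to 44; 10 + 34 is also 44.
out_a = ring_neuron(np.array([200, 10], dtype=np.uint8),
                    np.array([100, 34], dtype=np.uint8))

# A single input rotated by 255 steps, i.e. one step backwards.
out_b = ring_neuron(np.array([0], dtype=np.uint8),
                    np.array([255], dtype=np.uint8))
```

When the rotated inputs point in opposite directions the summed vector is close to zero and its direction is numerically unstable, which is one reason a real implementation would need more care than this sketch.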
Recalibrating Pythia from RoPE to PoPE
We patch pretrained Pythia models to use Polar Coordinate Positional Embeddings (PoPE) instead of RoPE and recalibrate on ~2% of the pretraining budget. After recalibration, PoPE matches RoPE perplexity at the training sequence length while generalizing much better to longer contexts.
DroPE Replication with Pythia
A replication of DroPE with Pythia models: rotary positional embeddings (RoPE) are patched to a no-op, followed by recalibration on The Pile for ~2% of the pretraining budget. While recalibration doesn't fully recover the original perplexity, models without RoPE generalize notably better to longer contexts.
SD Representation Similarity Explorer
An interactive visualization tool for exploring representation similarities in text-to-image diffusion models. It builds on the H-Space Similarity Explorer and adds further features for understanding diffusion model representations.
sdhelper
A Python helper package for working with Stable Diffusion models that enables easy extraction of U-Net and transformer representations. sdhelper provides a simple interface to load models, generate images, and analyze internal representations, supporting various models including SD1.x, SD2.x, SDXL, and FLUX.
Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
An unofficial implementation of the paper "Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models". This project explores and visualizes meaningful directions in the latent space of diffusion models.
Blog: Offline RL with Diversified Q-Ensemble
A blog post on offline reinforcement learning that analyzes the SAC-N and EDAC algorithms, focusing on how they use multiple critics to counter action-value overestimation in the offline setting.
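The role of multiple critics can be illustrated with the clipped target used by SAC-N: the bootstrap value is the minimum over N critic estimates, minus the entropy term. The sketch below is a NumPy illustration with hypothetical names and made-up numbers, not code from the blog post; EDAC's additional ensemble-diversification loss is not shown.

```python
import numpy as np

def sac_n_target(rewards, dones, next_q, next_log_prob, gamma=0.99, alpha=0.2):
    """Bellman target with a minimum over N critics.

    rewards, dones, next_log_prob: arrays of shape (batch,)
    next_q: array of shape (N, batch) with each critic's estimate Q_i(s', a')
    """
    # Taking the most pessimistic critic counters overestimation for
    # actions that are poorly covered by the offline dataset.
    min_q = next_q.min(axis=0)
    soft_value = min_q - alpha * next_log_prob
    # No bootstrapping on terminal transitions (dones == 1).
    return rewards + gamma * (1.0 - dones) * soft_value

rewards = np.array([1.0, 1.0])
dones = np.array([0.0, 1.0])
next_q = np.array([[2.0, 2.0],
                   [3.0, 3.0],
                   [1.0, 1.0]])  # N = 3 critics
next_log_prob = np.array([-0.5, -0.5])

targets = sac_n_target(rewards, dones, next_q, next_log_prob, gamma=0.9, alpha=0.2)
```

Increasing N makes the minimum more pessimistic, which is the lever SAC-N relies on in the offline setting.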
Spatiotemporal modeling of first and second wave outbreak dynamics of COVID-19 in Germany
In this paper, we model the spatiotemporal dynamics of COVID-19 in Germany using a reparameterized SIQRD network model. The model accurately predicts county-level infections and deaths, helping to identify effective measures and support local decision-making during the pandemic.