Published on January 24, 2025. Feed Forward Network Parameters (llm). Multiple figures exploring patterns within the feed forward network weight matrices.
Published on January 23, 2025. Logit Evolution - How the model decides (llm). Multiple figures exploring how the model's prediction of the next token evolves through the layers of the model.
Published on January 22, 2025. Attention Matrix Plots (llm). Interesting plots examining properties of the model parameters in the attention block.
Published on January 21, 2025. Attention Focus Plots (llm). Interesting plots showing how the transformer attention mechanism behaves.
Published on January 20, 2025. Experiment Setup (llm). An overview of the setup I use for the research: which model, dataset, etc.