- This is a neat idea. When I'm looking up models I usually want to see something about the architecture, but also some of the hyperparameters for the specific model: residual dimension, total number of layers, tokenizer configs. There's some of that in the visualization, but it's spotty.
The results for Nemotron 3 Nano are hard to parse, and I think actually incorrect: https://hfviewer.com/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-B... I'm guessing this is because the implementation uses layers that are all instances of the same class, with forward passes that branch on the layer type specified at construction time.
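To illustrate the guess above, here is a minimal sketch (hypothetical, not Nemotron's actual code) of the pattern being described: every decoder layer is an instance of the same class, and the forward pass branches on a type flag fixed at construction. Any viewer that keys on the module's class name would then label attention and Mamba layers identically.

```python
import torch
from torch import nn

class DecoderLayer(nn.Module):
    """Hypothetical unified layer: the mixer type is chosen at construction."""

    def __init__(self, hidden: int, layer_type: str):
        super().__init__()
        self.layer_type = layer_type  # "attention" or "mamba"
        self.norm = nn.LayerNorm(hidden)
        if layer_type == "attention":
            self.mixer = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        else:
            # Stand-in for a Mamba mixer; the point is only the class-level branching.
            self.mixer = nn.Linear(hidden, hidden)

    def forward(self, x):
        h = self.norm(x)
        if self.layer_type == "attention":
            h, _ = self.mixer(h, h, h)
        else:
            h = self.mixer(h)
        return x + h

layers = [DecoderLayer(64, t) for t in ("attention", "mamba", "mamba")]
# All three layers report the same class name, so the attention/Mamba split
# is invisible to class-based introspection:
print({type(l).__name__ for l in layers})  # {'DecoderLayer'}
```

If the viewer builds its graph from module classes rather than from the traced forward pass, this pattern would explain attention layers being rendered wrong.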
- Hi, I'm from Embedl (embedl.com / https://huggingface.co/embedl) and we made the hfviewer. Could you elaborate on why the Nemotron model visualization might be incorrect? A number of passes are performed to get the graph structure from the Hugging Face config, including sometimes exporting the model with torch.export and then recombining the graph to make the view meaningful. We would love to fix any issues and make the viewer better.
- The Nemotron model has attention layers interspersed with the Mamba layers, but I didn't see any attention layers in the visualization. It looks like they are present but rendered as blocks with an RMSNorm followed by two sequential linear layers. The first few resolution levels aren't very useful either.
- Cool beans, the granularity slider is such a nice touch. Credit to my teammate at Embedl who made this.
- Wow, looks so nice! I'm going to use the widget for my model cards