Type DeepSeek, then select the one you want* and click download
It should then show in your models
*When deciding which one to choose, look at Params: the bigger the model, the beefier the machine you’ll need. I downloaded the 7B model and it runs fine on my Mac.
You can then start a new chat and at the top select DeepSeek.
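If you’d rather talk to the model from a script instead of the chat window, here’s a minimal sketch. It assumes the app you downloaded the model with exposes an OpenAI-compatible local server (LM Studio, for instance, serves one at http://localhost:1234/v1 by default); the base URL and model id below are placeholders you’d swap for whatever your setup reports.

```python
# Minimal sketch: chat with a locally served DeepSeek model over an
# OpenAI-compatible endpoint. The base_url and model id are placeholders;
# adjust them to match your local server and the model you downloaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-7b",  # placeholder: use your model's id
    messages=[{"role": "user", "content": "Explain quantization in one paragraph."}],
)
print(response.choices[0].message.content)
```

Local servers generally ignore the API key, so any non-empty string works there.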
I was able to run the 32B model on a 4090, and it was pretty responsive. I would love to get my hands on a bunch of Mac Mini M4 Pros to compare the 4090 with a cluster of them.
I didn’t try the smaller models, so I can’t really compare them with the 32B. I did try the 70B model as well: it was slightly better than the 32B, but too slow for me to use for anything serious. (I suppose a 64GB Mac with an M4 Max should be able to run the 70B model.)
I don’t have a benchmark to compare against, or even a way to measure the accuracy of the answers, which is why I’ve included only the timings and memory usage; at least they give you an idea of the resources each model needs.
As mentioned, the 70B model doesn’t fit in my GPU’s memory, so I didn’t measure it.
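If you want to collect similar numbers yourself, here’s a rough sketch of one way to do it: send the same prompt to each model, time the response, and note GPU memory while it runs. It assumes an OpenAI-compatible local server (base URL and model ids are placeholders, as before) and `nvidia-smi` on the PATH; on a Mac you’d watch memory in Activity Monitor instead.

```python
# Rough comparison sketch: time a fixed prompt against each model and
# read current VRAM usage via nvidia-smi. Model ids and base_url are
# placeholders; the VRAM check only applies to NVIDIA GPUs.
import subprocess
import time

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
PROMPT = "Summarize the main ideas behind mixture-of-experts models."

def gpu_memory_used_mib() -> str:
    # Current VRAM usage in MiB, as reported by nvidia-smi.
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

for model in ["deepseek-r1-distill-qwen-7b", "deepseek-r1-distill-qwen-32b"]:  # placeholders
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": PROMPT}]
    )
    elapsed = time.perf_counter() - start
    # Some local servers report token counts in `usage`; fall back gracefully.
    tokens = resp.usage.completion_tokens if resp.usage else None
    rate = f"{tokens / elapsed:.1f} tok/s" if tokens else "n/a"
    print(f"{model}: {elapsed:.1f}s, {rate}, VRAM used: {gpu_memory_used_mib()} MiB")
```

It’s crude (wall-clock time includes prompt processing, and a single prompt isn’t a benchmark), but it’s enough to see the kind of timing and memory differences described above.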