Run Llama Without a GPU! Quantized LLM with LLMWare and Quantized Dragon

1,742 reads

by Shanglun Wang (@shanglun) · 12 min read · January 7th, 2024

Too Long; Didn't Read

As GPU resources become more constrained, miniaturization and specialist LLMs are slowly gaining prominence. Today we explore quantization, a cutting-edge miniaturization technique that allows us to run high-parameter models without specialized hardware.

Shanglun Wang (@shanglun)

Quant, technologist, occasional economist, cat lover, and tango organizer.

STORY’S CREDIBILITY: Original Reporting

This story contains new, firsthand information uncovered by the writer.
