Enhance Cost Efficiency in Domain Adaptation with PruneMe

May 16, 2024 17 min Free

MLOps World - MLOps World & Generative AI World 2024

domain-adaptation continual-pretraining ai-research llm large-language-models pruning cost-efficiency model-optimization transformer nlp

Description

This talk introduces PruneMe, an open-source repository that implements a layer-pruning technique for large language models (LLMs). Inspired by research on the ineffectiveness of deeper layers, the technique aims to make domain adaptation more cost-efficient. Removing redundant layers yields a streamlined model that is cheaper to continually pre-train, and the adapted model can then be merged into a larger, more performant model using techniques like Evolve Merging. The result is a cost-effective strategy for optimizing and adapting LLMs for specific domains.
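
The layer-selection idea in the description can be sketched in a few lines. This is a minimal illustration, not PruneMe's actual code. It assumes the common similarity-based heuristic: find the block of consecutive layers whose input and output representations differ least, measured here by angular distance, and drop that block. The function names (`angular_distance`, `find_prunable_block`, `prune_layers`) and the layout of `hidden_states` are hypothetical, chosen only for this example.

```python
import math


def angular_distance(u, v):
    """Angular distance in [0, 1]: arccos(cosine similarity) / pi."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos) / math.pi


def find_prunable_block(hidden_states, block_size):
    """Return (start, mean_distance) for the most redundant block of layers.

    Assumed layout: hidden_states[i] is a list of per-sample vectors at the
    input of layer i, and the final entry is the output of the last layer,
    so a model with N layers supplies N + 1 entries.
    """
    num_layers = len(hidden_states) - 1
    if not 1 <= block_size <= num_layers:
        raise ValueError("block_size must be between 1 and the layer count")
    best_start, best_dist = None, math.inf
    for start in range(num_layers - block_size + 1):
        before = hidden_states[start]
        after = hidden_states[start + block_size]
        # Average the distance over all calibration samples.
        dist = sum(angular_distance(u, v) for u, v in zip(before, after)) / len(before)
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start, best_dist


def prune_layers(layers, start, block_size):
    """Drop layers[start : start + block_size] and keep the rest in order."""
    return layers[:start] + layers[start + block_size:]
```

In practice the hidden states would come from running the model over a small calibration dataset, and the returned `start` index would decide which layers to remove before continual pre-training begins. A low mean distance suggests the block changes the representation very little, which is the sense in which its layers are treated as redundant.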

Up Next

42 min

Efficiently Fine-Tune And Serve Your Own LLMs

MLOps World - MLOps World & Generative AI World 2024

Alex Sherstinsky

llm-fine-tuning predibase ludwig lorax large-language-models lora parameter-efficient-fine-tuning peft transformer-models mistral-7b model-serving inference

47 min

LLM Fine-Tuning for Modern AI Teams: How One E-Commerce Unicorn Cut Inference Cost by 90%

MLOps World - MLOps World & Generative AI World 2024

Emmanuel Turlay

inference-cost data-preparation mistral-7b gpt-3.5 cost-reduction llm fine-tuning ai machine-learning e-commerce natural-language-processing model-evaluation

29 min

llm large-language-models ai machine-learning mlops training gpu kubernetes python tensorflow pytorch kernels optimization memory-management transformer

32 min

Memory Optimizations for Machine Learning

MLOps World - MLOps World & Generative AI World 2024

Tejas Chopra

model-pruning neural-networks cpu data-quantization machine-learning llm memory-optimization quantization inference deep-learning transformer-models gpu

On-Device ML for LLMs: Post-Training Optimization Techniques with T5 and Beyond

A Practical Guide to Efficient AI

Large Language Model Training and Serving at LinkedIn