How much does Phi 4 Multimodal Instruct cost?

Hosted pricing is not listed yet: input and output prices per 1M tokens are both TBD. Phi 4 Multimodal Instruct is open source and can be self-hosted.

What benchmarks has Phi 4 Multimodal Instruct been tested on?

Phi 4 Multimodal Instruct has been evaluated on 1 benchmark. Top score: Artificial Analysis Quality Index, 10.0.

Is Phi 4 Multimodal Instruct open source?

Yes, Phi 4 Multimodal Instruct is open source.

How does Phi 4 Multimodal Instruct compare to Qwen VL Max?

Phi 4 Multimodal Instruct and Qwen VL Max both have a listed average score of 0.0, so the averages do not separate the two models.


Phi 4 Multimodal Instruct

Name: Phi 4 Multimodal Instruct
Author: Microsoft

by Microsoft · Released Feb 2025


Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: automatic-speech-recognition
License: Open Source
Benchmarks: 1 tested

Data updated today

About

An automatic speech recognition model from Microsoft, with 308K downloads on Hugging Face.

Tested on 1 benchmark with a 0.0% average. Top score: Artificial Analysis Quality Index (10.0%).

Capabilities

speed: 16.7 (#57 globally)

Benchmark Scores


Tested on 1 benchmark · Ranked across 1 category

Score Distribution (all 231 models), on a 0 to 100 scale

Category: speed

Artificial Analysis — Quality Index

Artificial Analysis Quality Index. Composite quality score combining multiple benchmark results into a single metric.

Score: 10.0


Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)

Model Family · Microsoft Phi 4

Phi 4 (Jan 2025): score 43.2, $0.07/M input, 16K context, 16 benchmarks
Phi 4 Mini Instruct (Feb 2025): score 29.4 (-13.8 relative to Phi 4), price N/A, context N/A, 9 benchmarks
Phi 4 Multimodal Instruct (Feb 2025): price N/A, context N/A, 1 benchmark

Similar Models

Trinity Large Thinking (arcee-ai): score 0.0, $0.22/1M

BenchGecko API: microsoft-phi-4-multimodal-instruct

Specifications

Type: automatic-speech-recognition
Context: N/A
Released: Feb 2025
License: Open Source
Status: Active

Available On

Microsoft: TBD

Frequently Asked Questions

Phi 4 Multimodal Instruct is an open-source automatic-speech-recognition AI model by Microsoft, released in February 2025.
