Author: Hideya
Originally published in Towards AI.
Introduction
Large language models (LLMs) are evolving at an amazing pace – getting smarter, faster, and cheaper almost every month. Just recently, the open-weight models from the maker of ChatGPT, "gpt-oss", appeared.
At the same time, MCP (Model Context Protocol) servers have matured into powerful, practical tools. In fact, for many routine tasks, even a small, inexpensive LLM can be effective when connected to MCP servers.
This led me to a question:
👉 Which MCP tasks can be handled by cheap models? And how do performance, cost, and speed compare across providers?
To make experimenting easier, I created a command-line tool that lets you quickly test different MCP servers with different LLMs.
Yes – including gpt-oss on a free tier!
👉 And if you've ever hit GoogleGenerativeAIFetchError: 400 Bad Request
when trying Gemini with MCP servers from TypeScript – don't worry. I'll cover how this tool works around that problem in the last section.
Why this tool?
While experimenting, I often wanted to know things like:
- Can an MCP task work well on OpenAI's GPT-5-mini ($0.25 / $2.00 per million tokens)?
- Or even on the ultra-cheap GPT-5-nano ($0.05 / $0.40)?
- How about Gemini 2.5 Flash-Lite, known for good performance, low cost, and a free tier?
- Or the lightning-fast Cerebras + gpt-oss-120B, capable of up to 3,000 tokens per second?
This tool lets you compare all these scenarios with a simple configuration file.
How it works
By writing a configuration file, you can easily run the same set of queries against various MCP servers (e.g. Notion, GitHub) and compare the answers across LLMs.
Here is a simplified example in JSON5 (comments are supported):
{
  "llm": {
    "provider": "openai", "model": "gpt-5-mini"
    // "provider": "anthropic", "model": "claude-3-5-haiku-latest"
    // "provider": "google_genai", "model": "gemini-2.5-flash"
    // "provider": "xai", "model": "grok-3-mini"
    // "provider": "cerebras", "model": "gpt-oss-120b"
    // "provider": "groq", "model": "openai/gpt-oss-20b"
  },

  "mcp_servers": {
    "notion": {  // use "mcp-remote" to access a remote MCP server
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.notion.com/mcp"]
    },
    "github": {  // can be accessed directly if no OAuth is required
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp",
      "headers": { "Authorization": "Bearer ${GITHUB_PERSONAL_ACCESS_TOKEN}" }
    }
  },

  "example_queries": [
    "Tell me about my Notion account",
    "Tell me about my GitHub profile"
  ]
}
Key features:
- JSON5 format → supports comments and trailing commas (unlike plain JSON)
- Environment variable expansion → avoids hard-coding API keys in the config; ${ENVIRONMENT_VARIABLE_NAME} is replaced with the variable's value (see the sketch just below)
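For illustration, the expansion works roughly like the snippet below. This is not the tool's actual source code – just a minimal TypeScript sketch of the idea:

// Minimal sketch (not the tool's actual implementation) of expanding
// "${VAR}" placeholders in config strings against process.env.
function expandEnvVars(text: string): string {
  return text.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (placeholder, name) => {
    const value = process.env[name];
    if (value === undefined) {
      console.warn(`Environment variable ${name} is not set`);
      return placeholder; // leave the placeholder as-is
    }
    return value;
  });
}

// Example: "Bearer ${GITHUB_PERSONAL_ACCESS_TOKEN}" becomes "Bearer <your token>"
// once GITHUB_PERSONAL_ACCESS_TOKEN is defined in .env or the shell environment.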
Supported LLM providers
- OpenAI
- Anthropic
- Google Gemini (not Vertex AI)
- xAI
- Cerebras (for its speed running gpt-oss-120B)
- Groq (for its speed running gpt-oss-20B/120B)
👉 NOTE: only text output is supported (other response types are ignored).
Installation
Two versions are available:
- NPM version (TypeScript) – requires Node.js 18+
npm install -g @h1deya/mcp-client-cli
- PIP version (Python) – requires Python 3.11+
pip install mcp-chat
Start it with:
mcp-client-cli (NPM version)
mcp-chat (PIP version)
Trying the tool out
Before jumping into real MCP servers such as Notion or GitHub, I recommend starting with a minimal sandbox configuration. That way, you can confirm the tool works in your environment without worrying about API or OAuth tokens.
Simple configuration
Here is a basic configuration that connects to two local MCP servers:
- Filesystem MCP server → lets the LLM read/write local files under a specified directory
- Fetch MCP server → lets the LLM fetch web pages
{
  "llm": {
    "provider": "openai", "model": "gpt-5-mini"
    // "provider": "anthropic", "model": "claude-3-5-haiku-latest"
    // "provider": "google_genai", "model": "gemini-2.5-flash"
    // "provider": "xai", "model": "grok-3-mini"
    // "provider": "cerebras", "model": "gpt-oss-120b"
    // "provider": "groq", "model": "openai/gpt-oss-20b"
  },

  "example_queries": [
    "Explain how an LLM works in a few sentences",
    "Read file 'llm_mcp_config.json5' and summarize its contents",
    "Summarize the top headline on bbc.com"
  ],

  "mcp_servers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "."  // can only manipulate files under the specified directory
      ]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}
👉 This configuration is the best place to start. It requires no external API tokens and gives you a feel for how an LLM interacts with MCP servers.
- Save the above configuration as llm_mcp_config.json5.
- Add API keys to a .env file if needed:
ANTHROPIC_API_KEY=sk-ant-…
OPENAI_API_KEY=sk-proj-…
GOOGLE_API_KEY=AI…
XAI_API_KEY=xai-…
CEREBRAS_API_KEY=csk-…
GROQ_API_KEY=gsk_…
- Run the tool from the directory containing the two files above:
mcp-client-cli
or
mcp-chat
Below is an example of the console output when the tool starts:
% mcp-client-cli
Initializing model... { provider: 'cerebras', model: 'gpt-oss-120b' } Initializing 2 MCP server(s)...
Writing MCP server log file: mcp-server-filesystem.log
Writing MCP server log file: mcp-server-fetch.log
(info) MCP server "filesystem": initializing with: {"command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."],"stderr":14}
(info) MCP server "fetch": initializing with: {"command":"uvx","args":["mcp-server-fetch"],"stderr":16}
(info) MCP server "fetch": connected
(info) MCP server "fetch": 1 tool(s) available:
(info) - fetch
(MCP Server Log: "filesystem") Secure MCP Filesystem Server running on stdio
(info) MCP server "filesystem": connected
(MCP Server Log: "filesystem") Client does not support MCP Roots, using allowed directories set from server args: [ '/Users/hideya/.../mcp-chat-test' ]
(info) MCP server "filesystem": 14 tool(s) available:
(info) - read_file
︙
︙
(info) - list_allowed_directories
(info) MCP servers initialized: 15 tool(s) available in total
Conversation started. Type 'quit' or 'q' to end the conversation.
Example Queries (just type Enter to supply them one by one):
- Explain how an LLM works in a few sentences
- Read file 'llm_mcp_config.json5' and summarize its contents
- Summarize the top headline on bbc.com
Query: █
You will see the initialization logs, the MCP connections, and then a prompt where you can type queries – or just press Enter to replay the example queries one by one, which makes it easy to test many combinations of MCP servers and LLMs (the first example query serves as a sanity check of the LLM's behavior without involving MCP). Note that the local MCP servers' log files are written to the current directory (the fetch server doesn't log anything, so its file stays empty).
Use the --help option to find out how to change which configuration file is read and the directory where the log files are saved.
Free access to gpt-oss
Both Cerebras and Groq provide free tiers for the gpt-oss models, and both are supported by this tool.
Performance benchmarks (as of August 2025):
- GPT-5: ~200 tokens/s
- Cerebras + gpt-oss-120B: ~3,000 tokens/s 🤯
- Groq + gpt-oss-20B: ~1,000 tokens/s
- Groq + gpt-oss-120B: ~500 tokens/s
Setting up an account and API key is simple and does not require a credit card.
Advanced notes
- Implementation: This is an MCP client built on LangChain.js. It converts MCP tools into LangChain tools using a custom lightweight adapter (published on npm / PyPI). The agent itself is LangGraph's ReAct agent (see the sketch after this list).
- Gemini issue: Gemini + the official LangChain.js MCP adapters sometimes break on Gemini's strict JSON Schema rules, throwing 400 Bad Request. The NPM version (mcp-client-cli) works around this by transforming the schemas in its custom MCP adapter. The Python SDK does not have this problem. Details are in the next section.
- Differences between the NPM and PIP versions:
– NPM (mcp-client-cli) → the local MCP servers' logs are also printed to the console.
– PIP (mcp-chat) → logs go only to the files, and Python error messages can be harder to read.
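To make the implementation note above more concrete, here is a rough TypeScript sketch of that flow: start the configured MCP servers, wrap their tools as LangChain tools via the adapter, and hand them to a LangGraph ReAct agent. The adapter's import path, function name, and return shape are assumptions about the linked package – check its README for the exact API before copying this:

// Rough sketch of the CLI's internals – not its actual source code.
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
// Assumed adapter API (verify against the linked package's README):
import { convertMcpToLangchainTools } from "@h1deya/langchain-mcp-tools";

async function main() {
  // Same shape as the "mcp_servers" section of the JSON5 config file
  const mcpServers = {
    filesystem: { command: "npx", args: ["-y", "@modelcontextprotocol/server-filesystem", "."] },
    fetch: { command: "uvx", args: ["mcp-server-fetch"] },
  };

  // Start/connect the MCP servers and wrap their tools as LangChain tools
  const { tools, cleanup } = await convertMcpToLangchainTools(mcpServers);

  // Any LangChain chat model can be plugged in; the CLI picks one from the "llm" config
  const llm = new ChatOpenAI({ model: "gpt-5-mini" });
  const agent = createReactAgent({ llm, tools });

  const result = await agent.invoke({
    messages: [{ role: "user", content: "Read file 'llm_mcp_config.json5' and summarize its contents" }],
  });
  console.log(result.messages.at(-1)?.content);

  await cleanup(); // shut down the MCP server processes
}

main().catch(console.error);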
⚠️ Advanced: fixing Gemini + LangChain.js + MCP compatibility – avoiding 400 Bad Request failures
Feel free to skip this rather long and technical section.
If you've ever tried to use Google Gemini together with LangChain.js and MCP servers that have complex schemas, you may have run into this error:
(GoogleGenerativeAI Error): Error fetching from
https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent:
(400 Bad Request) Invalid JSON payload received.
Unknown name "anyOf" at ...
This message often repeats dozens of times before the request finally fails.
If you've been searching for GoogleGenerativeAIFetchError: (GoogleGenerativeAI Error) 400 Bad Request,
this section explains the cause and a workaround when using LangChain.js (you can avoid the issue entirely if you use Google Vertex AI).
Why this happens
- Gemini's schema requirements are very strict. MCP servers define their tools using flexible JSON Schemas, and most LLMs accept them as-is.
- But Gemini rejects valid MCP tool schemas if they contain fields it does not expect (e.g. anyOf).
- The result is a 400 Bad Request – even though the same MCP server works fine with OpenAI, Anthropic, or xAI.
- Google provides a fix in the new Gemini SDK (@google/genai), but LangChain.js cannot use it yet for architectural reasons.
👉 For many developers, this can make Gemini unusable with LangChain.js and some MCP servers. Even if just one MCP server with a complex schema is included in the MCP definitions passed to MultiServerMCPClient, every subsequent MCP call starts failing with the above error.
How this tool fixes it
This CLI automatically transforms MCP tool schemas into a Gemini-compatible format before sending them. This is handled by a custom lightweight MCP tool converter that the application uses internally.
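As a simplified illustration of what "Gemini-compatible" means here – the converter's actual rules are more involved, and the example below is mine, not taken from its source – consider a tool parameter declared with anyOf:

// Illustration only – not the converter's actual output.
// A valid MCP tool schema like this can trigger Gemini's
// 'Unknown name "anyOf"' 400 Bad Request error:
const originalInputSchema = {
  type: "object",
  properties: {
    page_id: {
      anyOf: [{ type: "string" }, { type: "null" }],
      description: "Target page ID",
    },
  },
};

// One plausible Gemini-friendly rewrite: collapse the anyOf into a single
// type and express the "or null" part as a nullable field instead:
const transformedInputSchema = {
  type: "object",
  properties: {
    page_id: { type: "string", nullable: true, description: "Target page ID" },
  },
};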
If you prefer to see the raw error behavior, you can turn this fix off by setting:
{
  "schema_transformations": false,
  "llm": {
    "provider": "google_genai", "model": "gemini-2.5-flash"
  }
  ...
}
If you are a Gemini user frustrated by these schema errors, this feature may let you use MCP servers that previously failed.
If your own project hits the same issue and you want the same workaround for LangChain.js, consider using the same MCP tool converter until the problem is fixed in the official SDK or in the MCP servers themselves.
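A hedged sketch of what that wiring could look like, reusing the same assumed adapter import as in the earlier sketch, paired with a Gemini chat model:

// Assumes an ESM module with top-level await; the adapter import is an assumption.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { convertMcpToLangchainTools } from "@h1deya/langchain-mcp-tools"; // assumed name

const { tools, cleanup } = await convertMcpToLangchainTools({
  notion: { command: "npx", args: ["-y", "mcp-remote", "https://mcp.notion.com/mcp"] },
});

// The adapter transforms the tool schemas up front, so Gemini no longer
// rejects them with 400 Bad Request:
const llm = new ChatGoogleGenerativeAI({ model: "gemini-2.5-flash" });
const agent = createReactAgent({ llm, tools });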
Conclusion
This command-line tool makes it easy to:
- Experiment with various LLMs and MCP servers
- Compare performance, cost, and speed
- Try open-source / open-weight models such as gpt-oss, even for free
If you're exploring the fast-evolving LLM + MCP ecosystem, I hope it helps you tinker, compare, and discover the configurations that work best for your projects. 🙏✨
Published via Towards AI