Posts ::

Using Claude Code, Copilot, Qwen, and Codex Together: How Should Personal Skills Be Organized?

2026-04-22

#ai #skills #claude-code #github-copilot #qwen-code #codex

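A question-driven note: starting from why multi-agent workflows get messy, it works through the boundary between always-on rules and skills, when Claude Code loads a skill, the tradeoffs of sharing source files across agents, and how to set things up for each agent.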

[Read more]

The Tradeoffs of Sharing Microservice Contracts: From Monorepo to Polyrepo, How Far Should Sharing Go

2026-04-22

#microservice #architecture #contract #polyrepo #monorepo #openapi #spring-boot

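How much should a BFF and its microservices actually share? Starting from the monorepo practice in Shop Platform, this post breaks down the tradeoffs of three approaches under polyrepo (shared contracts, shared clients, and contract testing) and records the setup I currently lean toward.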

[Read more]

ArchUnit as a Harness in the Code Agent Era: How to Adopt It in Microservices, Monorepos, and Ordinary Repos

2026-04-21

#archunit #java #microservice #monorepo #architecture #code-agent

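With code agents becoming widespread, I use ArchUnit as an executable harness in Shop Platform. Drawing on that practice, this post explains how to adopt it in microservices, monorepos, and ordinary repos.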

[Read more]

Extracting a Maven Archetype from an Existing Project

2026-04-21

#maven #archetype #spring-boot #scaffolding

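The complete process of extracting a Maven Archetype from an existing Spring Boot project, with notes on the pitfalls I actually hit: Velocity escaping, .gitignore being filtered out, the post-generate hook, the git-commit-id plugin, and more.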

[Read more]

Managing Microservice Scaffolds with Maven Archetype: Why I Am Not Using Spring Initializr Here for Now

2026-04-21

#java #maven #archetype #microservice #scaffold #spring-boot #cloud-native

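In a multi-module Maven monorepo microservice platform, we chose Maven Archetype over Spring Initializr to manage service scaffolding. This post explains the reasons behind that choice, the layering logic of the six kinds of Archetype, why I lean toward testing the Archetypes themselves, and the version maintenance strategy.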

[Read more]

Enabling Qwen3.6's preserve_thinking in vLLM: A Two-Machine A/B Verification

2026-04-20

#AI #LLM #vLLM #Qwen #DGX Spark #Reasoning #Chat Template

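The preserve_thinking switch that Qwen3.6 introduced alongside its KV cache fix is a chat template kwarg, not a vLLM CLI flag. Using the cluster's two DGX Spark machines for an A/B comparison, this post quantifies the difference in prompt and completion tokens.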

[Read more]

Performance Tuning Notes for omlx + Gemma4-26B on Apple M5

2026-04-19

#AI #LLM #MLX #omlx #Apple Silicon #M5 #Gemma #Benchmark

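Based on a local test on an M5, this post records the bandwidth bottleneck of MoE models and how an in-memory hot cache sped up long-context inference to about 6.4x.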

[Read more]

Deploying markitdown Locally on macOS: Converting Any Document to Markdown

2026-04-18

#markitdown #marker-pdf #macOS #Python #Document Conversion #Markdown #Local AI Tool #Table Extraction

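Notes on installing Microsoft markitdown on macOS as a local document conversion tool that converts Word, PDF, PPT, Excel, and other formats to Markdown in one step. Also covers switching to marker-pdf for table-heavy PDFs to get better conversion results.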

[Read more]

Running Qwen3.6-35B-A3B on Two DGX Sparks: Throughput of Direct vLLM vs. Through a Gateway

2026-04-17

#AI #LLM #NVIDIA #DGX Spark #vLLM #Benchmark #Gateway #Qwen

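Measured vLLM throughput for Qwen3.6-35B-A3B-FP8 on two DGX Sparks: ~50 tok/s for a single stream on one machine, and up to ~485 tok/s aggregate across both machines through a FastAPI Gateway at concurrency N=16.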

[Read more]

A Local Kind K8s Development Environment: Problem-Driven Tool Choices and Tradeoffs

2026-04-15

#kubernetes #kind #tilt #argocd #mirrord #spring-boot

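Notes on the tools I chose for running a Spring Boot microservice project locally on Kind: organized by problem, it shows how incremental builds, Tilt, mirrord, ArgoCD, and other tools address development pain points.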

[Read more]