InstaPhoto Co 印时达
Build capability brief — layer-by-layer audit against the full 382-requirement spec
Prepared for Daphne (大福妮)
Photo · Video · GIF capture
AI matting (no green screen)
AIGC style transfer
Dye-sub photo printing
WeChat QR sharing
ShouQianBa payment
Multi-store admin
Remote device telemetry
Stack
The spec's stack vs. what I'd actually use
The spec was written assuming a mainland Chinese team would build it on Java/Spring + Vue + Aliyun. I'd build the same scope on the tooling I use day-to-day. Functionally equivalent — just different stack choices that keep the build inside what I can actually deliver well.
| Layer | Spec default | | What I'd build with | Future optional swap |
| --- | --- | --- | --- | --- |
| Kiosk client | WPF / .NET 8 (C#) | → | Not in my scope — kiosk specialist hire | — |
| Backend | Java / Spring Boot + MyBatis | → | Next.js 14 + Neon Singapore + Drizzle | Aliyun PolarDB if mainland latency requires |
| Admin UI | Vue 3 + Element Plus | → | Next.js + Tailwind + PingFang SC · zh-CN primary | — |
| Hosting | Aliyun ACK / SAE | → | Vercel + custom domain (e.g. ops.instaphotoco.com) | Aliyun SAE / ECS if mainland access required |
| Object storage | Aliyun OSS mainland | → | Aliyun OSS Singapore (S3-compatible) | Aliyun OSS mainland region |
| Auth / login | Phone + SMS OTP | → | NextAuth + phone + Aliyun SMS OTP | + WeChat login |
| Payment | ShouQianBa · WeChat Pay | → | No payment integration in Phase 1 — manual payment_status field, ops updates when bank transfer arrives | WeChat Pay / ShouQianBa |
| Photo delivery | WeChat OA + Mini Program | → | Short-link QR (/p/{code}) — works on any phone, any browser, no app required | WeChat OA delivery + Mini Program |
| Ops alerts | WeChat OA template msg | → | DingTalk webhook (钉钉机器人) | + WeChat OA template msg / WeCom |
| Transactional email | — | → | Aliyun DirectMail (mainland-licensed sender) for password reset | — |
The Chinese-stack diligence point: every Phase 1 vendor choice above is selected for operational viability in mainland China. No US-only services (Stripe, Twilio, Sentry SaaS, Resend, PostHog) in the production path — those break or get filtered at venues. WeChat OA, ShouQianBa, and WeChat Pay all sit in Phase 2 — not because they're hard, but because Phase 1 should ship lean and prove the back office mechanics first. Adding them in Phase 2 is straightforward because you already have the existing accounts.
Current State
Where the engagement actually sits today — May 2026
What I'd build
Back office SaaS + REST API contract (back office ↔ kiosk) + help sourcing the kiosk specialist. I do NOT build the kiosk app itself — that's a different engineering discipline (camera/printer SDKs, real-time video) and needs the specialist hire we're working on now.
Sequencing (your call)
Sequential, not parallel. Kiosk specialist hire is secured first; back office build kicks off after that. This keeps the API contract design grounded in the real kiosk capabilities.
Where we are now
Drafting the kiosk specialist JD package: bilingual long-form JD, Boss直聘/Liepin short version, screening questions, salary band research (¥30–50K/month, full-time).
Realistic Timeline
When you actually get something usable — end to end
The honest read on expectations. My 5-week back office is one slice of a longer end-to-end timeline. Guests can't use a booth until three things exist: (1) the back office (me), (2) the kiosk software running on the booth (the specialist hire), and (3) the integration between the two. Here's the realistic full picture so you can plan around it:
What Phase 1 = (back office, shippable in 5 weeks)
Your ops team gets a real internal tool the day it ships — event scheduling, brand CRM, template management, booth fleet registry, dashboards. Replaces whatever WeChat group chats + spreadsheets you run on today. That's real internal value immediately.
What Phase 1 ≠ (guests still can't use a booth)
No camera capture, no print, no QR scan for guests — because the booth-side software hasn't been built yet. The end-to-end guest-usable product requires the kiosk specialist's work on top. Don't expect a working booth in week 5.
Today
JD prep done. Kiosk specialist search begins on Boss直聘 / Liepin.
+4 to 8 weeks
Kiosk specialist identified, interviewed, hired, onboarded. The 4 to 8 weeks is an estimate, not a cap: the search is the most variable step and could run longer. Call this point T0.
T0 + 5 weeks
My back office ships. Your ops team starts using it. API contract in place for the kiosk specialist to integrate against. Kiosk specialist is mid-build during this window.
T0 + 3 to 5 months
Kiosk specialist's booth software ready for integration testing. Range depends on whether the hire has shipped photo-booth software before — senior = closer to 3 months, generalist learning the domain = closer to 5+.
+ 2 to 3 weeks
Integration testing between their kiosk app and my API. First end-to-end dry-run on real hardware. Bug fixes.
≈ 5 to 7 months from today
First guest can walk up to a booth, get a photo, scan a QR, download it. End-to-end product is live.
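The headline figure is the sum of the three sequential ranges above: the hire, the kiosk build measured from T0, and integration testing. A minimal sketch of that arithmetic, assuming an average month of 52/12 weeks and that the 5-week back office build runs inside the kiosk build window, so it adds no time of its own:

```python
# Sum the sequential stages of the timeline; each range is (min, max) in weeks.
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length

stages = {
    "hire kiosk specialist": (4, 8),
    "kiosk build from T0": (3 * WEEKS_PER_MONTH, 5 * WEEKS_PER_MONTH),
    "integration testing": (2, 3),
}

def total_months(stages):
    """Return (low, high) end-to-end duration in months."""
    low_weeks = sum(rng[0] for rng in stages.values())
    high_weeks = sum(rng[1] for rng in stages.values())
    return low_weeks / WEEKS_PER_MONTH, high_weeks / WEEKS_PER_MONTH

low, high = total_months(stages)
print(f"{low:.1f} to {high:.1f} months")
```

Under these assumptions the raw sum lands at roughly 4.4 to 7.5 months, which is in line with the rounded 5 to 7 month headline; a slow specialist search pushes the upper end out further.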
Capability
What I can and can't build — honest layer-by-layer audit of the full spec
How to read this brief: the spec splits into 18 layers across kiosk + back office + AI + integrations. The tiers below classify each layer by my capability — what I can build, what needs a short ramp, what's genuinely new territory, and what I won't build at all (the kiosk app — that's the separate specialist hire's work). For Phase 1 contracted scope, I'm deliberately keeping it lean: back office + REST API + photo short-link delivery on a Chinese-friendly stack. WeChat OA, WeChat Pay, ShouQianBa, and Aliyun mainland deployment all sit in Phase 2 — fully buildable using your existing accounts, but kept out of Phase 1 so we ship the foundation in 5 weeks instead of fighting four integrations at once.
✅ Absolutely (11): Core competence, including WeChat + ShouQianBa with your credentials
🟡 Probably (4): Adjacent, needs 1–2 weeks of learning ramp per item
🟠 Maybe (2): Genuinely new, 3–4 weeks ramp (hardware SDKs, kiosk lockdown)
🔴 N/A (1): Kiosk app itself, not in my scope; the specialist hire owns this
⛔ Cannot (0): No partner-gated blockers; your existing mainland setup unblocks everything
From user tap to printed photo to WeChat share — color-coded by which stack layer owns each step
This product breaks cleanly into four stacks — each with its own tooling and constraints
The Chinese stack as actually shipped, with the closest Western equivalent if you were rebuilding this for, say, a US events company
01
Kiosk Framework
客户端框架
The Windows app that runs on every booth
Kiosk
Partial
🔴N/A · not in my scope, kiosk specialist hire
WPF / .NET 8 (C#)
Qt 6 / QML (C++)
WinUI 3
Unity (richer animation)
WPF dominates Chinese kiosk software — mature, deep Win32 access, easy DirectShow + EDSDK interop. Qt is preferred when targeting Windows + Linux (ARM) for embedded.
Electron + React
.NET MAUI
Tauri + Svelte
Unity
Western photo booth vendors (Simple Booth, Photobooth Supply Co.) often use iPad / Electron rather than Windows kiosks.
02
Camera Capture
相机采集
Live preview, burst, video, full-frame DSLR control
Hardware
Shared
🟠Maybe · 3–4 wk ramp
Canon EDSDK
Sony Camera Remote SDK
Nikon SDK
DirectShow
Media Foundation
Canon EOS bodies (R-series and M-series mirrorless, 5D and 6D DSLRs) are the photo-booth workhorse. EDSDK handles live view, autofocus, shutter, file transfer.
Canon EDSDK
gPhoto2
Sony Camera Remote SDK
AVFoundation (Mac)
DirectShow
Same SDKs — camera vendors don't fork by geography.
03
Image Processing
图像处理
Compose template · resize · color · stickers · GIF
AI / Image
Shared
✅Absolutely · standard tooling
OpenCV
FFmpeg
ImageSharp
SkiaSharp
Magick.NET
OpenCV
FFmpeg
ImageSharp
SkiaSharp
node-canvas
Imaging libraries are universal — no regional split.
04
AI Matting (no green screen)
人像抠图 (无绿幕)
Spec calls this out: "无绿幕扣图国内AI模型" (green-screen-free matting using a domestic AI model)
AI / Image
Fully Divergent
🟡Probably · 1–2 wk ramp
RMBG-2.0 (local ONNX)
MODNet
BiRefNet
Aliyun ImageSeg 图像分割
Baidu AI 人像分割
Tencent Cloud Image Matting
Spec mandates "国内AI模型" (domestic models). On-device ONNX is preferred for sub-second latency without cloud round-trip; cloud as fallback.
Remove.bg API
MODNet
BiRefNet
MediaPipe Selfie Segmentation
Photoroom API
05
AIGC Style Transfer
AIGC 风格化
Spec page 8-6: user picks an AIGC effect, image is restyled
AI / Image
Fully Divergent
🟡Probably · 1 wk ramp
通义万相 Wanxiang 阿里
文心一格 ERNIE-ViLG 百度
Hunyuan-DiT 腾讯
Doubao SeedDream 字节
Liblib.ai (SDXL hosted)
SiliconFlow / 模力方舟
The "国内AI模型" (domestic AI model) requirement applies here too. Liblib.ai is the de facto hosted Stable Diffusion service for Chinese product teams who want SDXL + LoRAs without self-hosting GPUs.
Stable Diffusion (Replicate / FAL)
OpenAI gpt-image-1
Google Imagen
RunPod self-host
Civitai LoRAs
06
Backend API
后端 API
Multi-store admin, orders, templates, devices
Backend
Partial
✅Absolutely · my core stack
Java / Spring Boot
MyBatis-Plus
Spring Cloud Alibaba
Nacos (config + discovery)
Sa-Token (auth)
Java + Spring Boot + MyBatis is the dominant Chinese SaaS pattern — virtually every WeChat-ecosystem backend looks like this.
Node.js / NestJS
Go / Gin
Python / FastAPI
Ruby on Rails
.NET 8 Web API
07
Admin Web UI
后台前端
"后台管理" (back-office management) sheet: stats, stores, UI assets, templates, orders, printers
Backend
Partial
✅Absolutely · admin/dashboard pattern
Vue 3 + Element Plus
Ant Design Vue
vue-admin-template
Vue-vben-admin
Pinia + Vite
vue-admin-template / vben are open-source admin scaffolds nearly every Chinese SaaS team starts with.
React + Ant Design
React + shadcn/ui
Refine.dev
Retool (low-code)
08
Database
数据库
Stores, users, orders, templates, devices
Backend
Partial
✅Absolutely · Neon + Drizzle
MySQL 8
PolarDB 阿里云
TiDB
MongoDB (template JSON)
PostgreSQL (RDS / Supabase)
MySQL
MongoDB Atlas
09
Object Storage + CDN
对象存储 + CDN
Every captured photo, GIF, video, template
Backend
Fully Divergent
✅Aliyun OSS Singapore P1 · OSS mainland P2
Aliyun OSS 对象存储
Tencent COS
Aliyun CDN + IMG service
OSS image processing (?x-oss-process=image/...) handles thumbnail / watermark / format conversion on read — saves a lot of backend code.
Amazon S3 + CloudFront
Cloudflare R2 + Images
Imgix / Cloudinary
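The on-read processing described for OSS can be sketched as a small URL helper. This is an illustration only: the bucket name is hypothetical, and the pipeline uses the `resize,w_<px>` and `format,<type>` actions of the `x-oss-process=image/...` query parameter mentioned above. Python is used here for brevity; the back office itself would carry the same logic in TypeScript.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def oss_thumbnail_url(object_url: str, width: int, fmt: str = "webp") -> str:
    """Append an x-oss-process image pipeline (resize, then format) to an OSS URL."""
    if width <= 0:
        raise ValueError("width must be positive")
    parts = urlsplit(object_url)
    # Drop any earlier pipeline so the helper can be applied repeatedly.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "x-oss-process"]
    query.append(("x-oss-process", f"image/resize,w_{width}/format,{fmt}"))
    # safe="/," keeps the pipeline readable instead of percent-encoding it.
    return urlunsplit(parts._replace(query=urlencode(query, safe="/,")))

# Hypothetical bucket in the Singapore region.
print(oss_thumbnail_url(
    "https://demo-bucket.oss-ap-southeast-1.aliyuncs.com/events/e1/p1.jpg", 300))
```

For private buckets the processing parameter would normally need to be covered by the signed URL as well, so in that case the helper would sit in front of whatever signing step the back office uses.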
10
Device Telemetry
设备状态上报
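"调取设备运行状态、打印机剩余纸张、在线状态是否报错" (fetch device running status, remaining printer paper, online status, and whether errors are being reported)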
Backend
Partial
🟡Probably · MQTT new, ~1 wk
EMQX (MQTT broker)
TDengine 时序库
RocketMQ
Prometheus (DIY)
Booths report heartbeat + printer paper count + camera focus state over MQTT. TDengine is a China-developed time-series database optimized for IoT workloads.
AWS IoT Core
HiveMQ / EMQX Cloud
InfluxDB
Datadog devices
11
Alerting
报错推送
"报错持续微信公众号推送功能" (ongoing error push via WeChat Official Account): printer offline, jam, focus, etc.
Integration
Fully Divergent
✅DingTalk webhook P1 · WeChat OA template msg P2
WeChat OA Template Msg 公众号模板消息
DingTalk Robot 钉钉机器人
企业微信 Work App
Aliyun SMS
WeChat OA template message is the standard channel — operators get a push the moment a booth misbehaves.
PagerDuty
Opsgenie
Slack incoming webhooks
Twilio SMS
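For the Phase 1 DingTalk channel, a booth alert is a JSON POST to the robot's webhook. The sketch below only builds the signed URL and the request body; it does not send anything. It assumes a custom robot with the "sign" security option enabled, and the token, secret, and booth id are placeholders.

```python
import base64
import hashlib
import hmac
import json
import time
from urllib.parse import quote_plus

DINGTALK_SEND_URL = "https://oapi.dingtalk.com/robot/send"

def signed_webhook_url(access_token, secret, timestamp_ms=None):
    """Build the webhook URL with the timestamp + HMAC-SHA256 signature."""
    ts = str(timestamp_ms if timestamp_ms is not None else int(time.time() * 1000))
    string_to_sign = f"{ts}\n{secret}"
    digest = hmac.new(secret.encode("utf-8"), string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    sign = quote_plus(base64.b64encode(digest))
    return f"{DINGTALK_SEND_URL}?access_token={access_token}&timestamp={ts}&sign={sign}"

def booth_alert_payload(booth_id, problem):
    """Plain-text robot message; booth_id and problem are illustrative fields."""
    return {"msgtype": "text",
            "text": {"content": f"[Booth alert] {booth_id}: {problem}"}}

def request_body(payload):
    """Serialize without escaping Chinese characters."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")

url = signed_webhook_url("demo-token", "SECdemo", timestamp_ms=1700000000000)
body = request_body(booth_alert_payload("booth-07", "printer offline"))
```

Keeping URL signing and payload construction as pure functions makes the alert path easy to unit-test; the actual HTTP POST (with `Content-Type: application/json`) is a thin layer on top.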
12
Payment
收银 / 支付
"收钱吧管理" (ShouQianBa management): paid prints, promo codes
Integration
Fully Divergent
🟡Phase 1: manual payment_status field (no integration)
✅Phase 2: ShouQianBa / WeChat Pay via your existing merchant
ShouQianBa 收钱吧
WeChat Pay 服务商
Alipay 当面付
UnionPay QR
ShouQianBa is an aggregated offline-payment box (one QR for WeChat + Alipay + UnionPay) — a small fact about the Chinese SMB POS world that determines a lot of architecture.
Stripe Terminal
Square Reader SDK
Adyen POS
SumUp
13
Sharing & Follow
分享与关注
"扫码关注能够细化到不同系列中" (scan-to-follow can be broken down by series): different campaigns → different OAs
Integration
Fully Divergent
✅Phase 1: short-link QR (/p/{code}) — works on any phone
✅Phase 2: WeChat OA scan-to-follow via your existing OA
WeChat OA 公众号
WeChat Mini Program 小程序
带参二维码 (parametric QR: ties each follow event to the user who scanned)
企业微信 客户群
Each event/series can be wired to a different OA — when users scan, the follow event arrives with a scene ID so you know which booth + which campaign drove the follow.
SMS short link (Twilio)
Email link (SendGrid)
AirDrop
Instagram / Facebook share
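The Phase 1 short-link QR in this layer needs a code that is hard to guess and easy to retype. A minimal sketch, assuming an 8-character code drawn from a 32-symbol alphabet (about 40 bits) and a placeholder base URL; the alphabet, length, and function names are illustrative choices, not part of the spec:

```python
import secrets

# Alphabet without look-alikes (0/O, 1/I) so codes survive being typed by hand.
ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ"

def new_photo_code(length: int = 8) -> str:
    """Random, unguessable code for the /p/{code} photo page."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def photo_url(base_url: str, code: str) -> str:
    """Full link to encode into the QR shown on the booth screen."""
    return f"{base_url.rstrip('/')}/p/{code}"

code = new_photo_code()
print(photo_url("https://example.com", code))
```

Uniqueness would still be enforced by the database (a unique index on the code column and a retry on collision); the random draw only makes collisions and guessing unlikely.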
14
Printers (Photo)
照片打印机
Dye-sublimation printers · paper count / jam events
Hardware
Shared
🟠Maybe · DNP SDK via FFI
DNP DS620 / DS820
HiTi P525L / P720L
Citizen CX-02
Mitsubishi CP-K60DW
Same printer hardware globally — dye-sub is the only viable print tech for this use case (instant + waterproof + photo quality).
DNP · HiTi · Citizen · Mitsubishi
Fujifilm ASK-300
15
Printers (Receipt)
小票打印机
"小票机器接入" (receipt printer integration): receipt with QR + promo
Hardware
Partial
🟠Maybe · ESC/POS easiest HW
Xprinter 芯烨
Gprinter 佳博
Epson TM-T82
ESC/POS protocol
Epson TM-T88
Star Micronics TSP100
ESC/POS protocol
16
Cloud / Hosting
云服务
Where the SaaS backend lives
Backend
Fully Divergent
✅Vercel P1
🟡Aliyun P2 · 2–4 wk
Alibaba Cloud (Aliyun) 阿里云
Tencent Cloud
Aliyun ACK (Kubernetes)
Aliyun PAI-EAS (AI inference)
AWS / GCP / Azure
Fly.io / Render
Replicate / FAL (AI)
17
Monitoring & Logs
监控与日志
Backend SLA + client crash reporting
Backend
Partial
✅Console + Aliyun SLS P1 · Aliyun ARMS P2 if needed
Aliyun ARMS
Apache SkyWalking
Sentry (self-hosted)
Aliyun SLS (logs)
Datadog · New Relic
Sentry
Grafana + Loki
18
Compliance
合规
User-facing kiosks with cameras + WeChat tie-in
Integration
Fully Divergent
🟡Probably · PIPL pass Phase 2
ICP 备案
PIPL 个人信息保护法
"免责声明" (disclaimer) page (spec page 2)
Photo retention policy
Disclaimer screen is in the spec as page 2 — required because the kiosk uploads identifiable face data to AI services. Retention timer should drop captures after N days (configurable per event).
GDPR / CCPA
Recording consent
COPPA (if minors)
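The per-event retention timer mentioned in this layer reduces to a date comparison. A minimal sketch with hypothetical field names and a 30-day default that is an assumption, not a spec value:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Capture:
    capture_id: str
    event_id: str
    captured_at: datetime

def expired_captures(captures, retention_days_by_event, now, default_days=30):
    """Return captures older than their event's retention window."""
    expired = []
    for capture in captures:
        days = retention_days_by_event.get(capture.event_id, default_days)
        if now - capture.captured_at > timedelta(days=days):
            expired.append(capture)
    return expired

now = datetime(2026, 6, 1, tzinfo=timezone.utc)
captures = [
    Capture("c1", "launch", now - timedelta(days=10)),
    Capture("c2", "launch", now - timedelta(days=3)),
    Capture("c3", "expo", now - timedelta(days=10)),
]
# "launch" keeps photos 7 days; "expo" falls back to the default.
to_delete = expired_captures(captures, {"launch": 7}, now)
```

A scheduled job would run this selection and then delete both the database rows and the stored objects, so the retention promise on the disclaimer page matches what is actually kept.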
A bounded Phase 1 first. Phase 2 is the optional follow-on if you want to add things later. Phase 3+ would be specialty SKUs only if there's a real customer ask.
Four pieces that aren't generic SaaS — they make or break a Chinese event-activation product
Every piece of hardware in a live event has a personality. Here's what tends to break, where the testing budget should go, and what the kiosk specialist will need to handle on their side.
Canon EDSDK / DSLR 单反相机
Critical
The whole product depends on it. EDSDK is notoriously fragile — Canon's own forums are full of "session locked, please reconnect" threads.
What breaks
- USB session lockup after ~2-4 hours of continuous live view
- Autofocus drifts on R-series mirrorless under fluorescent / mixed lighting
- Camera overheats during multi-hour video preview (worse on 5D Mark IV)
- Battery grip + cheap third-party batteries → random shutdowns; must run on mains power through a DC coupler (DR-E6 on LP-E6 bodies) for events
- EDSDK live-view buffer overflows if host doesn't drain fast enough
- Different bodies (R5, R6, 5D IV, M50) expose different command subsets — code that works on one fails on another
- Firmware updates change behavior silently
Test focus: 8-hour soak test per camera model; watchdog that detects session lock and re-initializes EDSDK; AC coupler always (never battery); warm-up sequence on event start; pre-event sanity check that snaps + retrieves one frame before going live.
Dye-Sub Photo Printer 热升华打印机
Critical
DNP DS620 / HiTi P525L jam more than you'd expect. A ribbon that runs out mid-print wastes a sheet and leaves a guest waiting.
What breaks
- Paper jams in humid venues (outdoor events, near beverages, summer)
- Static cling causes double-feed → user gets blank + good copy, or two blanks
- Ribbon + paper are matched pairs; loading mismatched media bricks the head
- Ribbon runs out mid-print → partial photo prints, sheet wasted
- Thermal head cooldown windows back up the queue during peak rush
- SDK behaviors differ: DNP DS-Tool, HiTi PrintAPI, Citizen SDK are not interchangeable
- Color shifts across ribbon batches → need ICC profile per batch for serious work
- Auto-cutter blade dulls after ~3000 prints → ragged edges
Test focus: jam-detection + auto-retry; pre-event print count vs ribbon capacity (alert before exhausted); media inventory tracking in admin; standardize on ONE printer model per fleet; humidity-controlled storage for media; dry-run 100 consecutive prints before every event.
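The "pre-event print count vs ribbon capacity" check is simple arithmetic the admin can run when an event is scheduled. A sketch with illustrative parameter names; the 20% safety margin is an assumption, and the remaining-print count would come from whatever the printer reports:

```python
import math

def ribbon_check(prints_remaining, expected_guests, prints_per_guest=1, margin_pct=20):
    """Compare remaining media against expected demand plus a safety margin."""
    needed = math.ceil(expected_guests * prints_per_guest * (100 + margin_pct) / 100)
    return {
        "needed": needed,
        "remaining": prints_remaining,
        "shortfall": max(0, needed - prints_remaining),
        "ok": prints_remaining >= needed,
    }

# Example: 400 prints left on the loaded media, 350 guests expected.
result = ribbon_check(prints_remaining=400, expected_guests=350)
print(result)
```

With these numbers the check reports a shortfall of 20 prints, which is the signal to pack a spare media set before doors open.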
On-Device AI Matting 本地抠图
Critical
The spec calls for a sub-500 ms feel. Real-world latency ranges from 180 ms (RTX 4060) to 1.5 s (older GPUs), and the slow end breaks the flow.
What breaks
- Cold-start: first inference after idle is 3–5× slower than steady-state
- GPU memory fragments over multi-hour sessions → OOM crashes
- NVIDIA driver auto-updates change inference numerics → masks shift
- Hard edge cases the model fails on: hair against busy background, glasses + reflections, dark hoodies on dark background, multiple subjects with overlap
- DirectML vs CUDA backend differences in mask quality
- RMBG-2.0 license terms (non-commercial unless paid) — check before shipping
Test focus: standardize GPU SKU across the fleet (don't mix 3060 + 4060 + integrated); pin NVIDIA driver version; warm-up inference on app boot; failure-mode gallery (50 test images of hard cases) gated in CI; cloud fallback to Aliyun ImageSeg if local takes >2s.
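The "cloud fallback if local takes >2s" rule can be sketched as a time-budgeted call. The two matting functions are injected placeholders (no real model or Aliyun client is called here), and the kiosk itself would implement this in its own language; the sketch only shows the control flow.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=2)

def matte_with_fallback(image, local_matte, cloud_matte, budget_s=2.0):
    """Return (mask, source): local result if it arrives in time, else cloud."""
    future = _pool.submit(local_matte, image)
    try:
        return future.result(timeout=budget_s), "local"
    except FutureTimeout:
        # Too slow. cancel() is a no-op once the call is running, so the
        # local worker finishes in the background and its result is dropped.
        future.cancel()
    except Exception:
        # Local inference failed outright (for example a GPU out-of-memory error).
        pass
    return cloud_matte(image), "cloud"

# Stand-in functions for illustration only.
mask, source = matte_with_fallback(
    "frame.jpg",
    local_matte=lambda img: f"local-mask:{img}",
    cloud_matte=lambda img: f"cloud-mask:{img}",
)
```

Because a timed-out local call keeps occupying a worker until it returns, a production version would also track how often the budget is missed and stop trying the local path after repeated timeouts.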
Venue Network / Cloud AIGC 现场网络 / 云端AI
Critical
Event WiFi is famously terrible. The kiosk depends on the cloud for AIGC restyle, OSS upload, payment callback, and WeChat OA.
What breaks
- Venue WiFi drops every 5–15 min during exhibitions when 10,000+ phones associate
- Liblib / Wanxiang AIGC P99 latency: 8s normal, 30s+ under peak
- Domestic AIGC APIs throttle / rate-limit during national holiday events
- OSS upload of a 20MB photo over LTE = 5–20s
- WeChat OA QR generation API has its own rate limit (5000/day per OA)
- Payment callback delayed → kiosk doesn't know whether to print
Test focus: always provision an LTE / 5G backup modem with auto-failover; soak-test under 50% packet loss + 800ms latency; queue-and-retry for OSS uploads; bounded timeout on cloud AIGC with offline-fallback to print-without-AIGC; pre-generate next batch of WeChat QR codes nightly.
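Queue-and-retry for OSS uploads usually means exponential backoff with jitter, so a fleet of booths does not retry in lockstep when venue WiFi comes back. A minimal sketch with illustrative defaults (6 attempts, 1 s base, 60 s cap); the upload function and sleep are injected, so nothing here talks to OSS:

```python
import random

def backoff_delays(max_attempts=6, base_s=1.0, cap_s=60.0, rng=random.random):
    """Yield full-jitter delays: uniform in [0, min(cap, base * 2**attempt)]."""
    for attempt in range(max_attempts):
        yield rng() * min(cap_s, base_s * 2 ** attempt)

def upload_with_retry(upload, item, sleep, **backoff_kwargs):
    """Try the upload once per delay slot; sleep between failed attempts."""
    for delay in backoff_delays(**backoff_kwargs):
        try:
            return upload(item)
        except ConnectionError:
            sleep(delay)
    raise RuntimeError("upload still failing; keep the item in the local queue")
```

In use, `sleep` would be `time.sleep` and `upload` the real OSS put call; on the final `RuntimeError` the photo stays on local disk and is retried later, so the guest can still print while the upload catches up.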
ShouQianBa Payment Terminal 收钱吧支付终端
High
Async callback model = many ways for the kiosk and the payment box to disagree about whether the user paid.
What breaks
- QR scan fails under fluorescent / very bright lighting
- WeChat / Alipay backbone callback latency spikes (5–30s) during peak hours
- User pays but backend never receives callback → "did it print?" disputes
- SDK / firmware updates from ShouQianBa break the integration silently
- No offline mode — venue WiFi outage = no payments
Test focus: idempotent payment polling (active query + callback both); reconciliation job every minute against ShouQianBa API; clear user-facing "checking payment..." state with timeout; staff override flow in admin for missing-callback refunds.
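"Active query + callback both" only works if marking an order paid is idempotent, so whichever signal arrives first releases the print and the second is ignored. A minimal in-memory sketch; the class and field names are hypothetical, and in the real back office the same rule would be a conditional database update (set paid only where the status is still pending):

```python
class PaymentLedger:
    """In-memory stand-in for an orders table keyed by order id."""

    def __init__(self):
        self._status = {}      # order_id -> "pending" | "paid"
        self.paid_via = {}     # order_id -> which signal won ("callback" / "poll")
        self.print_jobs = []   # order ids released to the printer, in order

    def create(self, order_id):
        self._status.setdefault(order_id, "pending")

    def mark_paid(self, order_id, source):
        """Apply a 'paid' signal; return True only for the first one."""
        if self._status.get(order_id) != "pending":
            return False  # unknown order, or already handled by the other path
        self._status[order_id] = "paid"
        self.paid_via[order_id] = source
        self.print_jobs.append(order_id)
        return True

ledger = PaymentLedger()
ledger.create("order-1")
first = ledger.mark_paid("order-1", "poll")       # polling job saw it first
second = ledger.mark_paid("order-1", "callback")  # late callback is a no-op
```

The recorded `paid_via` value also feeds the reconciliation job: orders that were only ever confirmed by polling are a direct measure of how often callbacks go missing.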
WeChat OA Parametric QR 公众号带参二维码
High
The whole sharing flow runs through this. WeChat platform constraints are restrictive and undocumented surprises are common.
What breaks
- Temporary QR codes expire in 30 days (max), permanent codes capped at 100k per OA
- scene_str max 64 chars → can't pack arbitrary metadata
- 48-hour service message window — miss it and you can't message the user
- WeChat-mainland-only — overseas users with non-CN WeChat accounts can't follow
- User must already be logged into WeChat on the scanning phone
- OA quota: 5000 QR generations/day per OA → multi-event days hit the wall
Test focus: per-event QR pool pre-warmed before doors open; rotate OAs in the pool when one hits quota; iOS WeChat + Android WeChat scan behavior side-by-side; HK / overseas WeChat account test cases; verified service-account domain for photo download page.
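Because `scene_str` is capped at 64 characters, the booth and campaign have to be packed as short ids rather than arbitrary metadata. A sketch of one possible encoding; the `e…​.b…​.c…` layout and the field names are assumptions for illustration. The parser also strips the `qrscene_` prefix that WeChat adds to the event key when a new follower subscribes through the QR.

```python
SCENE_MAX_LEN = 64

def make_scene_str(event_id, booth_id, campaign):
    """Pack short ids into one scene string, e.g. 'e2026expo.b07.csummer'."""
    for value in (event_id, booth_id, campaign):
        if "." in str(value) or not str(value):
            raise ValueError("ids must be non-empty and must not contain '.'")
    scene = f"e{event_id}.b{booth_id}.c{campaign}"
    if len(scene) > SCENE_MAX_LEN:
        raise ValueError(f"scene_str is {len(scene)} chars; limit is {SCENE_MAX_LEN}")
    return scene

def parse_scene_str(event_key):
    """Recover the ids from the event key of a follow or scan event."""
    prefix = "qrscene_"
    if event_key.startswith(prefix):
        event_key = event_key[len(prefix):]
    fields = {part[0]: part[1:] for part in event_key.split(".")}
    return {"event_id": fields["e"], "booth_id": fields["b"], "campaign": fields["c"]}

scene = make_scene_str("2026expo", "07", "summer")
```

Anything richer than these ids (template used, print size, operator) would live in the back office keyed by the same ids, not in the QR itself.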
Touchscreen Display 触摸屏
High
Spec demands signature input + multi-aspect-ratio layouts. Cheap panels disagree about edge zones and palm rejection.
What breaks
- Capacitive screens drift in cold (outdoor winter events) and high humidity
- Edge-of-screen touch zones unreliable on consumer-grade panels
- Palm rejection during signature varies wildly across panel vendors
- Direct sunlight washes out the screen at outdoor venues
- 4K vs 1080p DPI scaling shifts WPF layouts
- Vertical-mount (portrait) is rarely tested by panel vendors
Test focus: qualify ONE commercial-grade touchscreen model (e.g. ELO, Philips signage); test at 16:9, 9:16, 4:3, 3:2; signature test with palm on screen; outdoor sunlight test; cold-soak test in -10°C if doing winter events.
USB / Cabling / Power USB · 线缆 · 供电
High
Camera + photo printer + receipt printer + payment + touchscreen + LTE — all share USB ports on one PC. This is where mysterious failures live.
What breaks
- Devices plugged straight into the PC on a crowded USB bus drop out at random
- Camera + 2× printers exceeds USB hub power budget without external power
- Cable strain at events causes intermittent disconnects
- USB 2.0 hub bottlenecks live-view bandwidth from 4K cameras
- Power surges from venue mains brick photo printers (no surge protection)
- Hot-plug recovery is inconsistent across Windows builds
Test focus: ALWAYS use powered USB 3.0+ hub (qualify a specific model); UPS or at minimum surge protector on every booth; cable strain relief + clip-down; documented per-port assignment (camera on port 1, printer on port 2, etc.) so swaps are reproducible.
Receipt Printer (ESC/POS) 小票打印机
Medium
Less critical than photo printer (user can survive without a receipt) but still annoying.
What breaks
- ESC/POS "dialects" — Xprinter, Gprinter, Epson all interpret QR codes differently
- Paper roll runs out silently if no end-of-roll sensor
- Auto-cutter blade dulls and starts to chew paper
- QR codes printed too small for some phone cameras to scan
Test focus: standardize on ONE receipt printer; test QR-scan from 30cm with 3 different phones; paper change procedure documented for venue staff.
Multi-Aspect Rendering 多比例适配
Medium
Spec mandates auto-fit horizontal + vertical + multiple aspects. Easy to forget a layout.
What breaks
- Template renders correctly at 16:9 but breaks at 9:16
- Background images don't tile / cover correctly at extreme aspects
- Touch hit targets get tiny on portrait 4K
- Photo composite margins differ between camera 3:2 and screen 16:9
Test focus: render every template in the library at all 6 target aspects (16:9, 9:16, 4:3, 3:4, 3:2, 2:3) before approving; visual diff regression on every template edit.
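Checking every template at all six aspects starts with the same geometry each time: the largest centered crop of the source that matches the target ratio (the "cover" fit). A small sketch of that calculation; the 6000×4000 frame stands in for a 3:2 camera capture and is only an example size.

```python
from fractions import Fraction

TARGET_ASPECTS = [(16, 9), (9, 16), (4, 3), (3, 4), (3, 2), (2, 3)]

def cover_crop(src_w, src_h, aspect_w, aspect_h):
    """Largest centered crop with the target aspect ratio, as (x, y, w, h)."""
    target = Fraction(aspect_w, aspect_h)
    if Fraction(src_w, src_h) > target:
        # Source is wider than the target: keep full height, trim the sides.
        w, h = int(src_h * target), src_h
    else:
        # Source is taller than (or equal to) the target: trim top and bottom.
        w, h = src_w, int(src_w / target)
    return (src_w - w) // 2, (src_h - h) // 2, w, h

for aw, ah in TARGET_ASPECTS:
    print(f"{aw}:{ah}", cover_crop(6000, 4000, aw, ah))
```

Using exact fractions avoids off-by-one crops from floating-point rounding, which matters when visual-diff regression compares renders pixel by pixel.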
Device Telemetry MQTT 设备状态MQTT
Medium
Has to survive flaky venue networks and reconnect cleanly.
What breaks
- MQTT TCP socket gets stuck after WiFi roam
- QoS 1 retries can flood the broker if backed up after outage
- Forgetting to configure an MQTT Last Will message, so the backend never learns that a booth dropped offline
- Alerts fire on transient network blips → operator alert fatigue
Test focus: chaos-test with `iptables` blocks; require N-of-M missed heartbeats before alerting; deduplicate same-cause alerts at backend.
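The N-of-M rule and the deduplication can live in one small state machine on the backend. A sketch assuming 3 missed heartbeats out of the last 5 slots (both numbers are illustrative) and one alert per outage rather than one per missed beat:

```python
from collections import deque

class HeartbeatMonitor:
    """Alert when n_missed of the last `window` heartbeat slots were missed."""

    def __init__(self, n_missed=3, window=5):
        self.n_missed = n_missed
        self.window = window
        self._history = {}     # booth_id -> deque of bools (True = received)
        self._alerting = set() # booths with an open alert

    def record(self, booth_id, received):
        """Record one heartbeat slot; return 'alert', 'recovered', or None."""
        history = self._history.setdefault(booth_id, deque(maxlen=self.window))
        history.append(received)
        missed = sum(1 for ok in history if not ok)
        if missed >= self.n_missed and booth_id not in self._alerting:
            self._alerting.add(booth_id)
            return "alert"
        if missed == 0 and booth_id in self._alerting:
            self._alerting.discard(booth_id)
            return "recovered"
        return None

monitor = HeartbeatMonitor()
outage = [monitor.record("booth-07", ok) for ok in [False] * 4 + [True] * 5]
```

A single dropped heartbeat never pages anyone, a sustained outage produces exactly one alert, and recovery is only announced once a full window of clean heartbeats has been seen.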
Host PC 主机
Low
Standardize on a commercial mini-PC or industrial NUC and most issues vanish.
What breaks
- Consumer PCs throttle GPU/CPU under sustained AI inference load
- Windows Update reboots at 3am during multi-day events
- Anti-virus locks Canon EDSDK DLL or model files randomly
Test focus: qualify one PC model (e.g. NUC 13 Pro, Lenovo M90q); Windows IoT LTSC build; AV exclusions documented; disable auto-update during event window.
A handful of things I'd want answered before week 1 of the back office build, so the API contract I design matches the hardware the kiosk specialist will actually be working with.